Self Adaptive High Interaction Honeypots Driven by Game Theory Presented by: Mohamed Sharaf By: Gerard Wagener et al.
Agenda Introduction – Honeypot. – Honeypot types. Problem Play with the Enemy Questions
Honeypot… A pot that contains honey!!! Image courtesy: http://successfromthenest.com/content/discover-your-idea-honeypot/
Historical Background 1998 – Development began on “CyberCop Sting”, one of the first commercial honeypots. 1998 – BackOfficer Friendly is released, a free, simple-to-use, Windows-based honeypot. 1998 – Marty Roesch at GTE Internetworking began development on a honeypot solution that eventually became NetFacade. This work started the concept behind his open source IDS, “Snort”. 1999 – Formation of the Honeynet Project and publication of the “Know Your Enemy” (KYE) series. 2000/2001 – Honeypots are used to capture and study worm activity and are adopted for detection and research. 2002 – A honeypot is used to capture a new, previously unknown attack in the wild.
What is a Honeypot? A honeypot is a computer resource whose only purpose is to be exploited, that is, to let itself be compromised. So it is a trap, but only for computer criminals. The first commercial honeypot, CyberCop Sting, appeared in 1998. Since 2002, honeypots have been shared, and some honeypots are available to the research community.
The basic concept… Suppose we have a machine that is not dedicated to any user and offers no legitimate communication or services to the public. If we nevertheless capture incoming or outgoing traffic on it (barring OS updates), we can conclude with very high confidence that we are under attack and that there is malicious activity against our network.
Honeypot Types: 1. LIH: Low Interaction Honeypot. 2. HIH: High Interaction Honeypot. 3. Hybrid Honeypot (a combination of LIH and HIH). 4. Adaptive HIH (the model proposed by the paper under study).
1. Low Interaction Honeypot (LIH) A virtual honeypot: all “offered” services of a low interaction honeypot are emulated. It takes its name from the limited interaction (activities) the attacker can achieve. This type is used to collect malware; the end goal is simply to capture a downloaded malware sample.
Examples of LIH: Google Hack Honeypot, HoneyBOT, honeytrap, KFSensor, Multipot, Nepenthes, PHP Honeypot Project.
Disadvantages of LIH It cannot be used to discover new types of attack. After discovering that this is a virtual machine, attackers can mislead the administrators with false attack patterns, or simply do nothing.
Disadvantages of LIH (cont.) Attackers can easily discover that they are dealing with a honeypot. How? – The suspicious machine has many open TCP ports, or an uncommon combination of open network ports, e.g. TCP port 17300, used by the backdoor left by the Kuang2 virus. – A clear sign that a given host is running nepenthes appears when you connect to TCP port 21: $ nc xxx.xxx.xxx.xxx 21 220 ---freeFTPd 1.0---warFTPd 1.65--- One would expect the banner of a single FTP server, but nepenthes replies with two different banners, one for freeFTPd and one for warFTPd. A human can clearly identify this uncommon response and conclude that this is indeed a honeypot.
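The banner check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the list of known products are assumptions made for the example (the list holds just the two servers named on the slide), and the sketch works on a banner string that has already been read, rather than opening a network connection.

```python
# Hypothetical list of FTP server product names; only the two products
# mentioned on the slide are included, so it is far from exhaustive.
KNOWN_FTP_SERVERS = ("freeFTPd", "warFTPd")


def products_in_banner(banner):
    """Return the known FTP product names that appear in a banner string."""
    lowered = banner.lower()
    return [name for name in KNOWN_FTP_SERVERS if name.lower() in lowered]


def looks_like_emulator(banner):
    """A real FTP server announces one product; more than one is suspicious."""
    return len(products_in_banner(banner)) > 1


if __name__ == "__main__":
    banner = "220 ---freeFTPd 1.0---warFTPd 1.65---"
    print(products_in_banner(banner), looks_like_emulator(banner))
```

An attacker could automate this kind of test, which is why emulated services are comparatively easy to fingerprint.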
HIH: High Interaction Honeypot Adversaries hold valuable attacks that may be new (“zero-day attacks”). Their goal is to keep these attacks undisclosed in order to profit from them as much as possible. HIH goals: – Discover zero-day attacks. – Provide remedies/fixes for the vulnerabilities these attacks exploit.
HIH Tools High Interaction Honeypot software: – HIHAT: the High Interaction Honeypot Analysis Toolkit transforms arbitrary PHP applications into web-based high-interaction honeypots. – Sebek: a tool for collecting forensic data from compromised high-interaction honeypots.
Disadvantages of HIH: An HIH is a real machine, whereas the services of an LIH are emulated, so the attacker can perform malicious actions. – For example, he could try to attack other hosts on the Internet from your honeypot, or send spam from one of the compromised machines. This makes us liable for whatever the attackers do. We have to make sure that attackers cannot compromise our production network. However, there are ways to safeguard high-interaction honeypots and mitigate this risk, such as the Honeywall by the Honeynet Project.
How does a honeynet work? A highly controlled network in which every packet entering or leaving is monitored, captured, and analyzed.
Honeynet components 3 key components: data control, data capture, data analysis.
Data control Mitigate the risk of the honeynet being used to harm production systems – Count outbound connections – IPS (Snort-Inline) – Bandwidth throttling
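As an illustration of the "count outbound connections" idea, here is a minimal Python sketch of a sliding-window limiter. It is not how the Honeywall implements data control; the class name, the limit, and the window length are all assumptions chosen for the example, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict, deque


class OutboundLimiter:
    """Allow at most max_connections outbound connections per honeypot
    within a sliding window of window_seconds (hypothetical policy)."""

    def __init__(self, max_connections, window_seconds):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # honeypot IP -> timestamps

    def allow(self, honeypot_ip, now):
        history = self._history[honeypot_ip]
        # Forget connections that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_connections:
            return False  # limit reached: the connection would be dropped
        history.append(now)
        return True


if __name__ == "__main__":
    limiter = OutboundLimiter(max_connections=2, window_seconds=60)
    for t in (0, 1, 2, 61):
        print(t, limiter.allow("10.0.0.5", now=t))
```

Once the limit is reached, further outbound connections from the compromised honeypot are refused until older ones age out of the window.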
Data capture Capture activities at various levels – Application – Network – OS level
Data analysis Manage and analyze captured data from honeypots – Investigate malware – Forensic purposes
Adaptive High Interaction Honeypot (AHIH) So far, the HIH has its pros and cons. The AHIH tries to trim some of the disadvantages, such as liability for the attackers' actions, by providing a mechanism of adaptability. An adaptive HIH may sometimes provoke attackers into revealing more of their new attacks, by either blocking an attack or letting it through. This increases the value of the knowledge captured from the attackers.
Honeypot Hierarchical Probabilistic Automaton (HPA) The states of the automaton are defined as the programs that can be executed on the honeypot. The set Qa contains the programs installed on the honeynet.
Honeypot Hierarchical Probabilistic Automaton [Figure: example of a honeypot hierarchical probabilistic automaton for programs executed on the honeypot]
Scenario of Attack 1. The attacker penetrates the honeypot through the SSH server with probability 1. 2. The attacker remains in the sshd state. 3. The attacker executes the program bash or uname with equal probability, 0.5 each. 4. Executing bash moves the automaton to the bash state, where the programs wget, rm, ls, and uname are equally likely, 0.25 each. That is, the conditional probability Pr(wget | bash) = 0.25.
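The scenario can be written down as a small transition table. The sketch below uses the probabilities given on this slide; the "start" state, the dictionary layout, and the function name are assumptions introduced for the example, not notation from the paper.

```python
# Transition probabilities taken from the scenario on the slide.
# "start" is a hypothetical entry state added for this sketch.
HPA = {
    "start": {"sshd": 1.0},
    "sshd": {"bash": 0.5, "uname": 0.5},
    "bash": {"wget": 0.25, "rm": 0.25, "ls": 0.25, "uname": 0.25},
}


def path_probability(path, automaton=HPA):
    """Multiply the conditional probabilities along a sequence of states.
    A transition that is not in the table contributes probability 0."""
    probability = 1.0
    for current, following in zip(path, path[1:]):
        probability *= automaton.get(current, {}).get(following, 0.0)
    return probability


if __name__ == "__main__":
    # Pr(sshd) * Pr(bash | sshd) * Pr(wget | bash) = 1 * 0.5 * 0.25
    print(path_probability(["start", "sshd", "bash", "wget"]))
```

For the path sshd, bash, wget the product is 1 × 0.5 × 0.25 = 0.125.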
Attacker Process Tree In Linux, each process has a process identifier (PID) and a parent process identifier (PPID). The attacker usually starts with a privilege-separated process of the SSH server, P0. The process P0 then forks, resulting in two clone processes, P1 and P2.
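A process tree can be rebuilt from logged (PID, PPID) pairs. The following Python sketch shows one way to do it; the record format, the sample PIDs, and the function names are assumptions for illustration and do not reflect the log format used in the study.

```python
from collections import defaultdict


def build_tree(records):
    """Map each parent PID to the list of its child PIDs.
    records is an iterable of (pid, ppid) pairs."""
    children = defaultdict(list)
    for pid, ppid in records:
        children[ppid].append(pid)
    return children


def tree_size(children, root):
    """Count the nodes of the tree rooted at root (iteratively, so that
    very deep trees do not hit Python's recursion limit)."""
    count, stack = 0, [root]
    while stack:
        pid = stack.pop()
        count += 1
        stack.extend(children.get(pid, []))
    return count


if __name__ == "__main__":
    # Hypothetical session: P0 = 100 forks P1 = 101 and P2 = 102,
    # and P1 later starts one more process, 103.
    records = [(101, 100), (102, 100), (103, 101)]
    tree = build_tree(records)
    print(tree_size(tree, 100))
```

Each successful SSH session then corresponds to one tree rooted at the session's P0.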
Modeling Attackers and Honeypot actions The current AHIH can accept or block the execution of a program, which is implemented by allowing or blocking the do_execve() call in the Linux kernel. Let the probability of blocking do_execve() be Pr(Block); the probability of allowing it is then 1 - Pr(Block). When blocked, the attacker may think the machine (the honeypot) is not ready yet. S/he will then invest time in getting the machine ready, for example by downloading source code and recompiling it, or by launching other types of attacks.
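The block/allow choice can be pictured as a single biased coin flip. The Python sketch below runs in user space purely for illustration; the real decision is made inside the kernel around do_execve(), and the function name and return values here are assumptions.

```python
import random


def honeypot_action(pr_block, rng):
    """Return "block" with probability pr_block, otherwise "allow"."""
    if not 0.0 <= pr_block <= 1.0:
        raise ValueError("pr_block must be a probability in [0, 1]")
    # rng.random() is uniform on [0, 1), so this is true with
    # probability pr_block.
    return "block" if rng.random() < pr_block else "allow"


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    decisions = [honeypot_action(0.3, rng) for _ in range(10)]
    print(decisions)
```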
execve() and do_execve() – Concept: Linux programs are launched using the execve() system call. The function prototype for C programmers looks like this: – int execve(const char *filename, char *const argv[], char *const envp[]);
Here, filename is the name of the executable file to run and the pointer arrays argv and envp contain the command-line arguments and environment variable strings respectively for the new program. The execve() function is responsible for determining the format of the named file and for taking appropriate actions to load and execute that file. In the case of shell scripts that have been marked as executable, execve() must instantiate a new shell, which in turn is used to execute the named script. In the case of compiled binaries, which are predominantly ELF these days, execve() invokes the appropriate loader functions to move the binary image from disk into memory, to perform the initial stack setup, and ultimately to transfer control to the new program. The execve() function is implemented within the Linux kernel by the do_execve() function, which can be found in a file named fs/exec.c.
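To see the three arguments in action, here is a minimal sketch using Python's os.execve, which wraps the same system call. It assumes a POSIX system; the helper name run_with_execve is hypothetical, and the child simply runs the Python interpreter itself so the example does not depend on any particular binary being installed.

```python
import os
import sys


def run_with_execve(path, argv, env):
    """Fork, replace the child with the program at path via execve(),
    and return the child's exit status (-1 if it did not exit normally)."""
    pid = os.fork()
    if pid == 0:
        try:
            # On success this never returns: the child's process image
            # is replaced by the new program.
            os.execve(path, argv, env)
        finally:
            # Only reached if execve() failed.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    argv = [sys.executable, "-c", "import sys; sys.exit(7)"]
    print(run_with_execve(sys.executable, argv, dict(os.environ)))
```

Here argv[0] conventionally repeats the program name, and env plays the role of envp.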
Attackers and Honeypot Interactions In the game between the attacker and the honeypot, there are three possible reactions when an attack is blocked: – Retry the command: The attacker may think that the download repository used to store the malware is suffering a temporary failure. This pushes him/her to use a backup repository, in which case another repository is exposed. -- Pr(Retry) – Select an alternative solution: The attacker may try to debug the program on the honeypot. -- Pr(Alternative) – Quit: If the attacker suspects that the machine is a honeypot, s/he may decide to quit. -- Pr(Quit)
Attacker / Honeypot possible Interaction The attacker's possible actions after being blocked are governed by the relation: Pr(Retry) + Pr(Alternative) + Pr(Quit) = 1
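The constraint above says the three reactions form a probability distribution. The Python sketch below checks that constraint and then draws one reaction; the function name and the example probabilities are assumptions for illustration, not values from the paper.

```python
import random

REACTIONS = ("retry", "alternative", "quit")


def sample_reaction(pr_retry, pr_alternative, pr_quit, rng):
    """Draw the attacker's reaction to a blocked command.
    The three probabilities must sum to 1."""
    total = pr_retry + pr_alternative + pr_quit
    if abs(total - 1.0) > 1e-9:
        raise ValueError("Pr(Retry) + Pr(Alternative) + Pr(Quit) must equal 1")
    weights = [pr_retry, pr_alternative, pr_quit]
    return rng.choices(REACTIONS, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)  # seeded only to make the demo repeatable
    # Hypothetical probabilities chosen for the demo.
    print([sample_reaction(0.5, 0.3, 0.2, rng) for _ in range(5)])
```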
Computing Payoffs Two honeypot games are proposed. The games differ in how the payoff is computed. 1. Number of transitions: The attacker’s goal is to minimize the number of transitions in the HPA, while the honeypot tries to maximize it. 2. Path probability payoff:
Experimental Evaluation A honeypot capable of detecting do_execve() and clone() calls is set up. The honeypot runs in Qemu, an x86 emulator. The kernel inside Qemu is modified to log process IDs. System logs are transmitted to a syslog-ng server. The default running service is SSH. The SSH server is configured so that no password is asked.
Continue … Exp. Evaluation The honeypot was operated on IPv4 with Ubuntu 7.1 as the OS. The Linux OS itself ran in a virtual machine operated by Qemu ver. 0.9.1. The honeypot was operated for almost 3 months, during which 637 successful SSH logins and 12140 failures were observed. Attackers tested 1763 non-existent accounts with different passwords (accounting for the high number of failures). The successful logins came from 183 unique IP addresses. The honeypot was periodically mounted and checksums were computed. Why?
– To detect any malicious change to the OS kernel. Also, reboot was set to power off. 637 process trees were recovered. The smallest tree has only one node and the largest has 1954 nodes (mostly related to bot interactions). Why? – Because a botmaster has long sessions with the bots s/he controls.