MIS 5208 Processing and Analyzing Data. Ed Ferrara, MSIA, CISSP, eferrara@temple.edu


1 MIS 5208 Processing and Analyzing Data
Ed Ferrara, MSIA, CISSP eferrara@temple.edu

2 Please sign up for this training!
Reminder: Temple, as a member school of Internet2, is entitled to free training and certification exams for Splunk Power Users:
If you register and take the courses described in the blog posting yourself, you will have access to the teaching materials in PDF form as part of the e-learning course.

3 Review Search
There are ____ components to the Search and Reporting interface?
Options: 5, 2, 7, 1
Answer: 7

4 Review Search

5 Review Search
What is the most efficient way to filter events in Splunk?
Options: By Time, By Host, With the admin user, In app
Answer: By Time

6 Review Search
When a search is run, events are returned in _________?
Options: Chronological order, PDF, Alphabetical order, Reverse chronological order
Answer: Reverse chronological order

7 Review Search
Commands that create statistics or visualizations are called _________?
Options: Transforming commands, Machine learning commands, Math, Data science
Answer: Transforming commands

8 Review Search
How many search modes does the Search & Reporting app have?
Options: 2, 5, 3, 4
Answer: 3

9 Review Search
Which character acts as a wildcard in Splunk?
Options: ~, %, *, !
Answer: *

10 Review Search
What are the Boolean operators in Splunk?
Options: AND, AFTER, OR, IF
Answer: AND, NOT, OR

11 Data Sources
Many system applications and network devices, such as routers and switches, relay events over network ports using the TCP or UDP protocols. Some applications use the SNMP standard to send events over UDP. Syslog, a standard for computer data logging, is another rich source of information that can be captured at the network port level.
Splunk can be configured to accept input from a TCP or UDP port. Use the Splunk Web user interface to configure a network input source, where you specify:
Host
Port
Sourcetype
Once you save the configuration, Splunk starts indexing the data arriving on the specified network port. This kind of network input can capture syslog data generated on remote machines, where the data does not reside locally on a Splunk instance. Splunk forwarders can also be used to gather data on remote hosts.
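To make the network input idea concrete, here is a minimal Python sketch of the sending side: a remote machine building a simplified syslog-style line and pushing it to a UDP port. The function names, the sample hostname and tag, and the omission of the syslog timestamp are assumptions made for illustration; nothing here comes from the course materials or from a Splunk API.

```python
import socket


def build_syslog_message(facility: int, severity: int,
                         hostname: str, tag: str, text: str) -> bytes:
    """Build a simplified syslog-style line: <PRI>HOSTNAME TAG: text.

    The priority value is facility * 8 + severity. The timestamp that a
    full syslog message carries is left out here to keep the sketch short.
    """
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    pri = facility * 8 + severity
    return f"<{pri}>{hostname} {tag}: {text}".encode("utf-8")


def send_udp_event(message: bytes, host: str, port: int) -> None:
    """Send one event as a single UDP datagram to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
```

For example, `send_udp_event(build_syslog_message(16, 6, "web01", "demo", "user login ok"), "127.0.0.1", 5514)` would send one event to a UDP input listening on port 5514 of the local machine, where 5514 is a hypothetical port chosen for this example. UDP gives no delivery guarantee, so a lost datagram is simply a lost event.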

12 Windows Data
The Windows operating system produces a number of log files that contain information about:
Windows events
Registry
Active Directory
WMI
Performance
Other data
Splunk recognizes Windows log streams as a source type and allows one or more of these log streams to be added as indexed input for further processing. Although Windows sources such as Active Directory can be configured individually, Splunk provides an easier way of dealing with Windows logs and events by using:
Splunk App for Windows
Splunk Technology Add-on for Windows

13 Windows Technology Add-On On Linux
Note: The Windows Technology Add-on can be installed on Splunk running on Windows. If you are running Splunk on Linux, the Windows TA can be installed on a forwarder running on a Windows machine. Forwarders are explained later in this chapter and in Chapter 15 of the text.

14 Splunk also provides a similar Technology Add-on for Linux and Unix, known as *Nix.
This add-on uses both log files and scripting to bring the different sets of event and log data available in Linux or Unix into Splunk. You can install the *Nix Technology Add-on on Linux systems.

15 Other Apps

16 Getting to Know Combined Access Log Data
In the real world, enterprises have numerous applications, and most of them run on a heterogeneous infrastructure that includes all sorts of hardware, databases, middleware, and application programs. It is not possible to have Splunk running locally or near each of these applications or pieces of infrastructure, so the data will not be local to Splunk. What we have seen in this chapter is how to get data into Splunk when the data is local to it. The use cases assumed that Splunk can access the files or directories, which could be on local file systems or on remote file systems attached to the machine where Splunk is running. A Splunk forwarder is the same as a standard Splunk instance, but with only the essential components required to forward data to receivers, which could be the main Splunk instance or an indexer.

17 Processing and Analyzing Data

18 Processing and Analyzing Data
Requirement
Understand the data set that you want to process and analyze. Get intimately acquainted with the data you will work with first.
Review
Log files are generated by almost all kinds of applications and servers:
End-user applications
Web servers
Complex middleware platforms
Operating systems and firmware also write huge amounts of raw data to log files. The challenge lies in understanding, analyzing, and mining the raw data in the log files and making sense of it.

19 Preview Data
JohnDoe [10/Oct/2000:13:55: ] "GET /apache_pb.gif HTTP/1.0" " "Opera/9.20 (Windows NT 6.0; U; en)"
Client IP address: The IP address of the client (the machine, host, or proxy server) that made the HTTP request for a web application or an individual web page. The value in this field may instead be a hostname.
-: This field is used to identify the client making the HTTP request. Its contents are highly unreliable; a hyphen is typically used, which indicates that the information is not available.
JohnDoe: The user ID of the user who is requesting the web page or application.
[10/Oct/2000:13:55: ]: The timestamp of when the server finished processing the request. The format can be controlled using web server settings.
"GET /apache_pb.gif HTTP/1.0": The request line received from the client. It shows the method (in this example GET), the resource that the client requested (in this case /apache_pb.gif), and the protocol used (in this case HTTP/1.0).
200: The status code that the server sends back to the client. Status codes are very important because they tell whether the request was fulfilled successfully or failed, in which case some action needs to be taken. 200 indicates that the request was successful.
2326: The size of the data returned to the client. In this case 2326 bytes were sent back. If no content was returned to the client, this value is a hyphen.
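The status and size fields described above usually need a little interpretation before they can be analyzed. The following Python sketch shows one way to do that; the function names and the category labels are assumptions chosen for this example, not part of Splunk or the web server.

```python
def describe_status(status: int) -> str:
    """Map an HTTP status code to a coarse outcome category."""
    if 200 <= status < 300:
        return "success"
    if 300 <= status < 400:
        return "redirect"
    if 400 <= status < 500:
        return "client error"
    if 500 <= status < 600:
        return "server error"
    return "other"


def parse_size(raw: str) -> int:
    """Convert the size field to bytes, treating a hyphen as no content."""
    return 0 if raw == "-" else int(raw)
```

With the values from the slide, `describe_status(200)` falls in the "success" category and `parse_size("2326")` gives 2326 bytes, while `parse_size("-")` is treated as zero bytes returned.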

20 Preview Data
JohnDoe [10/Oct/2000:13:55: ] "GET /apache_pb.gif HTTP/1.0" " "Opera/9.20 (Windows NT 6.0; U; en)"
Referrer: This field shows where the request was referred from. You may see web site URLs as the values in this field. Referrer information helps web sites and online applications see how users arrive at the site, and it can be used to decide where online advertising dollars should be spent. Note that the HTTP header itself is spelled "Referer", with one "r" missing. The misspelling originated in the original proposal and was carried into the HTTP specification. In browsers such as Chrome, where users can browse in incognito mode or have referrers disabled, the values in this field will not be accurate. In HTML5, the user agent that reports this information can be instructed not to send the referrer.
"Opera/9.20 (Windows NT 6.0; U; en)": This is the user-agent field, which holds the information that the client browser reports about itself. A value like "Opera/9.20 (Windows NT 6.0; U; en)" means that the request is coming from an Opera browser running on a Windows NT 6.0 operating system (that is, Windows Vista or Windows Server 2008). User-agent information helps to optimize web sites and web applications and to cater for requests coming from smaller form factor devices such as the iPad and mobile phones.
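The field layout described on the last two slides can be sketched as a small Python parser. This is an illustration under stated assumptions, not how Splunk extracts fields: the regular expression, the field names, and the function name are chosen for this example. The sample line also fills in a client IP address, seconds and time zone, and a referrer URL as placeholders, because those values are not present in the slide text.

```python
import re
from datetime import datetime

# One named group per field of the combined access log format.
COMBINED_PATTERN = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
)

# Hypothetical sample: IP, seconds, time zone, and referrer are placeholders.
SAMPLE = (
    '127.0.0.1 - JohnDoe [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Opera/9.20 (Windows NT 6.0; U; en)"'
)


def parse_combined(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Split the request line into method, resource, and protocol.
    method, _, rest = fields["request"].partition(" ")
    uri, _, protocol = rest.rpartition(" ")
    fields.update(method=method, uri=uri, protocol=protocol)
    fields["status"] = int(fields["status"])
    # A hyphen in the size field means no content was returned.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields
```

Calling `parse_combined(SAMPLE)` yields one dictionary per event, so fields such as `status`, `bytes`, and `useragent` can be counted or averaged in the same spirit as the searches in the lab. Lines that do not follow the format return `None` rather than raising, which keeps a batch run going when a log contains stray entries.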

21 Look at some of the data…

22 Lab 4

23 Load the Data

24 Load the Data

25 Load the Data

26 Load the Data

27 Load the Data

28 Load the Data

29 Load the Data

30 Load the Data

31 Load the Data

32 Load the Data

33 Load the Data

34 Search the Data

35 Chapter 3

36 List All Fields

37 List All Fields

38 List All Fields

39 List All Fields

40 List All Fields - CategoryID

41 List All Fields – date_hour

42 List All Fields – Time Selection

43 List All Fields – Average Over Time

44 Thank you

