Carlos Fernando Gamboa RACF, BNL HEPiX

1 Document Oriented Database Infrastructure for Monitoring HEP Data Systems Applications
Carlos Fernando Gamboa RACF, BNL HEPiX Brookhaven National Laboratory, NY, USA October 2015

2 Overview
1. Brief ELK framework review
2. ELK test deployment to monitor storage-related applications

3 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Logstash: data collection and formatting
Elasticsearch: data storage
Kibana: visualization and data analysis

4 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
An event is shipped by the logstash-forwarder client, then collected and processed sequentially at the Logstash server:
Client: File, read by logstash-forwarder (lumberjack; compression, encryption)
Server: Logstash input (lumberjack), Logstash filters (Grok(), Date(), GeoIP()), Logstash output (elasticsearch)
Visualization: Kibana
Logstash filters:
- Grok: parses arbitrary text and structures it.
- Date: parses dates from fields and then uses that date or timestamp as the Logstash timestamp for the event.
- GeoIP: adds information about the geographical location of IP addresses, based on data from the MaxMind database.
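The sequential Grok and Date steps described above can be illustrated with a small Python sketch. This is not Logstash code: the regular expression is a hypothetical stand-in for a grok pattern, the function names are illustrative, and the year is an assumption supplied by the caller because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Hypothetical stand-in for a grok pattern: named groups play the role
# of the %{SYNTAX:field} captures used by grok.
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<program>[^:\s]+): (?P<body>.*)$"
)

def grok(event):
    """Structure the raw 'message' text into named fields (grok-like step)."""
    match = SYSLOG_LINE.match(event["message"])
    if match:
        event.update(match.groupdict())
    else:
        # Logstash tags events that fail to match in a similar way.
        event.setdefault("tags", []).append("_grokparsefailure")
    return event

def date(event, year):
    """Use the parsed timestamp as the event time (date-filter-like step)."""
    if "timestamp" in event:
        parsed = datetime.strptime(
            f"{year} {event['timestamp']}", "%Y %b %d %H:%M:%S"
        )
        event["@timestamp"] = parsed.isoformat()
    return event

def process(message, year=2015):
    """Apply the filters one after the other, in a fixed order."""
    event = {"message": message}
    return date(grok(event), year)
```

Calling `process()` on a syslog line returns a dictionary holding the original message, the extracted fields, and an `@timestamp` derived from the line itself rather than from the time of arrival.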

5 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Logstash: data collection and formatting
Elasticsearch: data storage
Kibana: visualization and data analysis

6 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem Elasticsearch
A document-oriented, horizontally scalable database:
- Built on Apache Lucene (Java).
- A mapping is comparable to a schema definition in SQL databases. If the mapping has not been created, the server will guess the field types from the values in the document.
- The query language, called Query DSL, is based on JSON; queries can also be issued via the URL API, i.e.:
~]# curl -XGET '
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : { "total" : 44, "successful" : 44, "failed" : 0 },
  "hits" : {
    "total" : 2,
    "max_score" : ,
    "hits" : [ {
      "_index" : "aws-se ",
      "_type" : "secure_bestman",
      "_id" : "AU_6UcW9_b_e2-r1bS0Y",
      "_score" : ,
      "_source":{"message":"Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm 23 08:57:54","sec_host":"aws01","sec_oper":"sudo","sec_sudo_user":"bestman","sec_path":"/tmp","sec_user":"usatlas3","sec_command":"/bin/rm","sec_target":"/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT _ pool.root.1","syslog_received_at":" T13:08:27.193Z","received_from":"aws01.racf.bnl.gov"}
Diagram labels: File, Document, Field, Type (comparable to a table), Index (comparable to a database)
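The two ways of querying mentioned above (a JSON Query DSL body, or a query string in the URL) can be sketched in Python as follows. The sketch only builds the request pieces and sends nothing; the host `http://localhost:9200` and the index name `example-index` are placeholders, not values from the deployment described here.

```python
import json
from urllib.parse import quote

def match_query(field, value, size=10):
    """Build a Query DSL request body that matches `value` in `field`."""
    return {"query": {"match": {field: value}}, "size": size}

def search_url(base_url, index, query_string=None):
    """Build the _search URL; with a query string it becomes a URI search."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    if query_string is not None:
        url += "?q=" + quote(query_string, safe=":")
    return url

# Placeholder host and index name (assumptions, not from the slides).
body = match_query("sec_user", "usatlas3")
print(json.dumps(body, indent=2))
print(search_url("http://localhost:9200", "example-index", "sec_user:usatlas3"))
```

The JSON body would be sent to the plain `_search` URL, while the `?q=` form carries the whole query in the URL, which is convenient for quick checks with curl.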

7 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Logstash: data collection and formatting
Elasticsearch: data storage
Kibana: visualization and data analysis

8 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem Kibana
An analytics and visualization platform designed to work with Elasticsearch.
The input field allows interactive queries to be issued.
Discover page layout: Input field, Index, Fields, Results
Dashboard layout: Visualization 1, Visualization 2, Visualization 3, Visualization N

9 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem Kibana
Provides dynamic creation of individual visualizations:
- Based on individual searches (interactive or saved) or on other visualizations.
- Pie charts, histograms, bar charts, and tile maps are available to create the visualization.
Dashboard:
- Displays a group of stored visualizations.
- A search field and a time filter are enabled by default in the dashboard.
Dashboard layout: Time filter, Search field, Visualization 1 (pie charts), Visualization 2 (histograms), Visualization 3 (bar chart), Visualization (tile maps)
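A visualization built on a search plus a time filter corresponds roughly to an Elasticsearch aggregation request. The Python sketch below builds such a request body for a pie chart; it is an illustration under assumptions, not the request Kibana actually generates. The aggregation name `by_field` and the default window `now-7d` are hypothetical choices, and `@timestamp` is assumed to be the event time field.

```python
import json

def pie_chart_request(field, since="now-7d", buckets=5):
    """Build a request body: top `buckets` values of `field` since `since`."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"range": {"@timestamp": {"gte": since}}},
        "aggs": {"by_field": {"terms": {"field": field, "size": buckets}}},
    }

# One pie slice per distinct command seen in the last 30 days.
print(json.dumps(pie_chart_request("sec_command", since="now-30d"), indent=2))
```

Changing `since` plays the role of the dashboard time filter: the same aggregation is recomputed over a different window.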

10 The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
The event:
~]# tail -1 /var/log/secure
Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm /mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT _ pool.root.1
The filter:
filter {
  if [type] == "secure_bestman" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{SECURE}"}
      add_field => [ "syslog_received_at", ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}
The output (Kibana): visualized on Kibana (events aggregated)
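The definition of the custom %{SECURE} pattern is not shown, so the following Python sketch is only a guess at what such a pattern has to capture. The regular expression is an assumption; the field names (`sec_host`, `sec_oper`, `sec_sudo_user`, `sec_path`, `sec_user`, `sec_command`, `sec_target`, `received_from`) are taken from the stored document shown earlier, and the file path used in the usage lines is a made-up placeholder.

```python
import re

# Assumed equivalent of the custom SECURE grok pattern (not the real one).
SUDO_EVENT = re.compile(
    r"^\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} "
    r"(?P<sec_host>\S+) (?P<sec_oper>\w+): \s*(?P<sec_sudo_user>\S+) : "
    r"TTY=\S+ ; PWD=(?P<sec_path>\S+) ; USER=(?P<sec_user>\S+) ; "
    r"COMMAND=(?P<sec_command>\S+)(?: (?P<sec_target>.*))?$"
)

def parse_secure(message, host):
    """Return the structured event, or None if the line does not match."""
    match = SUDO_EVENT.match(message)
    if match is None:
        return None
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    fields["message"] = message
    # Mirrors add_field => [ "received_from", "%{host}" ] in the filter.
    fields["received_from"] = host
    return fields

# Placeholder target path; the path in the original event is not reproduced.
line = ("Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; "
        "USER=usatlas3 ; COMMAND=/bin/rm /tmp/example.file")
print(parse_secure(line, "aws01.racf.bnl.gov"))
```

Each named group becomes one field of the stored document, which is what makes per-user or per-command aggregation possible later in Kibana.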

11 ELK test deployment to monitor storage related applications.

12 Monitoring selected storage services
Application log files are monitored using the Elasticsearch, Logstash and Kibana (ELK) framework.
AWS SE (Amazon Web Services, Simple Storage Service (S3)): Bestman, SRM, Gridftp 1, Gridftp 2. No central collection of information.
BNL dCache SE: information consolidated into the BILLING logs.
BNL ELK monitoring. Network links in the diagram: WAN, LAN.

13 BNL Test ELK layout
3 AWS VMs and 3 physical servers monitored.
Monitored hosts (Server 1, Server 2, Server N): logfile, read by logstash-forwarder.
BNL ELK test server: Logstash input (lumberjack), Logstash filters, Logstash output (elasticsearch), Kibana.
Network links in the diagram: WAN, LAN.

14 BNL Test ELK layout
Test dashboards:
- Intended to be used by the site admin.
- Nginx is used to serve/proxy access to the dashboards.
Link to interactive query dashboard

15 dCache Billing Monitoring Dashboard
- Dashboard ported to Kibana 4.1, using previous work done for Kibana 3 as a reference [2].
- Data collected using the published grok filter patterns [2].
- Integrated tile maps, error charts, and stats, among other improvements.
Dashboard panels: 15 Top Pools, Read/Writes per Sunit type, Event Dist. per Transfer Protocol, Top Errors per Transfer Protocol, Detail record

16 AWS SE Bestman Monitoring Dashboard
Dashboard panels: Total size buckets, Gridftp transfers, SRM File Deletion
Visualization created using grok filter patterns [1]

17 dCache Billing Dashboard: performance with a 5-minute refresh period
Current stable configuration.
- No major client overhead on the monitored hosts.
- Tuning effort is concentrated on Elasticsearch and Kibana, working with different parameters, such as:
  - Thread pool search memory
  - Kibana timeouts
Plot label: ELK Test server

18 dCache Billing dashboard aggregated report performance
Report windows: Last 7 days, Last 30 days, Last 60 days, Last 90 days
dCache Billing document size is ~400M
Total size: 320 GB

19 BNL Test ELK Software/Hardware
1 ELK node deployed.
ELK software:
- Logstash 1.5.4
- Elasticsearch 1.7
- Kibana 4.1.1
- Logstash-forwarder 0.40
OS: RHEL 6.6
Legacy hardware used:
- Head node: IBM x3650 M3 node, CPUs: 16 x 2.53GHz, GB Memory, 10Gbps network interconnectivity
- External storage: IBM DS3500, 12 SAS 15krpm 600 GB/disk
Diagram labels: ELK Node 1, Node 2, DS3500, DS3500 Expansion (three units)

20 Sources/References
1. Peter Love’s
2. dCache Development Team
3. General reference information: Rich presentation about ELK
4. Johan Guldmyr: Example of Elasticsearch, Kibana with a different data collector infrastructure
5. Ilija Vukotic

21 Thank you

22 Backup slide

23 Logstash INPUT FILTER OUTPUT
Software functionality is distributed as a modular, pluggable pipeline infrastructure: INPUT, FILTER, OUTPUT.
INPUT:
- stdin(): testing, troubleshooting
- Logstash-forwarder(): compression, transmission
- Redis(), RabbitMQ(): large clusters, queuing
- file(), Syslog(), Rsyslog()
FILTER:
- Grok(): extract data using pattern matching
- Date(): parse timestamps from fields; allows assigning a time format to the processed event
- Mutate(): manipulate, modify event field data
- Geoip(): find IP address geo-location using the MaxMind database
OUTPUT:
- Storage: File, S3, MongoDB, Elasticsearch
- Relay: RabbitMQ, TCP
- Notifications: Nagios
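The modular input, filter, output arrangement can be mimicked in a few lines of Python. This is a conceptual sketch only, not how Logstash is implemented: every function name is hypothetical, plugins are modelled as plain callables, and the field `site` with value `example-site` is made up for the demonstration.

```python
def run_pipeline(inputs, filters, outputs):
    """Pass every event from the input through each filter, then to each output."""
    for event in inputs:
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:  # a filter may drop the event
                break
        else:
            # Reached only when no filter dropped the event.
            for emit in outputs:
                emit(event)

def lines_input(lines, event_type):
    """Input plugin stand-in: turn text lines into events."""
    for line in lines:
        yield {"message": line.rstrip("\n"), "type": event_type}

def drop_empty(event):
    """Filter stand-in: discard events with a blank message."""
    return event if event["message"].strip() else None

def mutate_add_field(name, value):
    """Filter factory stand-in, loosely modelled on a mutate-style filter."""
    def apply(event):
        return {**event, name: value}
    return apply

# Output plugin stand-in: collect the events in a list.
collected = []
run_pipeline(
    lines_input(["first line\n", "\n", "second line\n"], "example"),
    [drop_empty, mutate_add_field("site", "example-site")],
    [collected.append],
)
print(collected)
```

Because each stage is just a callable, swapping the list output for a different sink, or inserting another filter, does not require changes to the other stages, which is the point of the pluggable design.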

