Data Source Integration Manual
Generated: 5/11/2013 4:00 pm
Table of Contents
Introduction
    Overview
How to add a data source
    Create a technology add-on
Source types
    Out-of-the-box source types
Examples
    Generic example
    Example 1: Blue Coat Proxy Logs
    Example 2: OSSEC
Resources
    FAQ
    Common Information Model Field Reference
    More resources
Introduction
Overview
The Payment Card Industry Data Security Standard (PCI DSS) is an industry standard for all organizations that handle cardholder data. This data can include credit cards, debit cards, ATM cards, and point of sale (POS) cards. The Data Security Standard is made up of twelve (12) requirements with which businesses are expected to comply.

The Splunk App for PCI Compliance gives the PCI compliance manager visibility into PCI compliance-relevant data captured and indexed within Splunk. The app's scorecards, reports, and correlation searches are designed to present a unified view of PCI compliance across heterogeneous vendor data formats. Traditional approaches do this by normalizing the data into a common schema at the time of data collection. Splunk provides this unified view through search-time mappings to a common set of field names and tags that can be defined at any time after the data is captured, indexed, and available for ad hoc search.

These search-time mappings mean that you don't need to write parsers up front before you can start collecting and searching the data. However, you do need to define the field extractions and tags for each data format before the PCI compliance scorecards, reports, and correlation searches will work on that data. These tags and field extractions are defined in technology add-ons. The Splunk App for PCI Compliance ships with an initial set of these add-ons; this manual explains how to create your own.

Technology add-ons contain the Splunk "knowledge" (field extractions, tags, and source types) necessary to extract and normalize detailed information from the data sources at search time and make the resulting information available for reporting. By creating your own technology add-ons, you can easily add new or custom types of data and fully integrate them with the existing dashboards and reports within the Splunk App for PCI Compliance.
Once you have created a technology add-on, you can add it to your Splunk App for PCI Compliance deployment or post it to Splunkbase to share with others.
Each technology add-on is designed for a specific data format, such as a particular vendor's firewall or router. Once the technology add-on is created, data sources simply need to be assigned the corresponding source type for the technology add-on to begin processing the data.
Get data into Splunk

Getting data into Splunk is necessary before the Splunk App for PCI Compliance can normalize the data at search time and use the information to populate dashboards and reports. This step is highlighted because some data sources can be captured in different ways, resulting in different data formats. As you develop a technology add-on, it is important to define accurately the method by which the data will be captured and indexed within Splunk. Common data input techniques used to bring in data from PCI compliance-relevant devices, systems, and software include:

- Streaming data over TCP or UDP (for example, syslog on UDP 514)
- API-based scripted input (for example, Qualys)
- SQL-based scripted input (for example, Sophos, McAfee EPO)

For more detailed information on getting data into Splunk, see the section "What Splunk can index" in the core Splunk product documentation. The specific way in which a particular data set is captured and indexed in Splunk should be clearly documented in the README file in the technology add-on directory.

Choose a folder name for the technology add-on

A technology add-on is packaged as a Splunk app and must include all of the basic components of a Splunk app, in addition to the components required to process the incoming data. All Splunk apps (including technology add-ons) reside in $SPLUNK_HOME/etc/apps. The following table lists the files and folders of a basic technology add-on:

File/Directory          | Description
default                 | Contains the files related to the processing of the data
default/app.conf        | Describes the app and provides the ability to disable it
default/eventtypes.conf | Defines the event types (categories of events for the given source type)
default/props.conf      | Contains attribute/value pairs that define how the data is processed
default/tags.conf       | Defines the tags
default/transforms.conf | Contains additional attribute/value pairs required to process the data
local                   | Includes configuration files that are custom to a particular installation
metadata                | Includes files that describe the app parameters
lookups                 | Contains the lookup CSV files
README                  | Describes the add-on, including configuration instructions, and the supported product version
The transforms described in the transforms.conf file define operations that can be performed on the data. The props.conf file references the transforms so that they execute for a particular source type. In practice, the distinction is not so clear-cut, because many of the operations can be specified directly in props.conf without using transforms.conf. See "Create and maintain search-time field extractions" in the core Splunk product documentation for more details. Examples of these files are included in TA-template.zip, located in the $SPLUNK_HOME/etc/apps directory.

When building a new technology add-on, you must decide on a name for the add-on folder. Use the following naming convention:
TA-<datasource>
The technology add-on folder should always begin with "TA-". This allows you to distinguish between technology add-ons and other add-ons within your Splunk deployment. The <datasource> section of the name should represent the specific technology or data source that this technology add-on is for. Technology add-ons that are shipped as part of the Splunk App for PCI Compliance follow this convention. Examples include:
TA-bluecoat TA-cisco TA-snort
For additional details on deploying and configuring technology add-ons with the Splunk App for PCI Compliance, see the PCI Compliance Installation and Configuration Manual.

Define a source type for the data

By default, Splunk automatically sets a source type for a given data input. Each technology add-on should have at least one source type defined for the data that is captured and indexed within Splunk. This requires overriding the automatic source type that Splunk attempts to assign to the data source. The source type definition is handled by Splunk at index time, along with line breaking, timestamp extraction, and timezone extraction. (All other information is set at search time.) See the section "Specify source type for a source" in the core Splunk product documentation.

You need to add a new source type for your technology add-on, making the source type name match the product so that the technology add-on will work. Be aware that this process overrides some default Splunk behavior. For explicit information on how to define a source type within Splunk, see "Override automatic source type assignment" in the core Splunk product documentation.

The source type name should match the name of the product for which you are building a technology add-on (for example, "nessus"). Technology add-ons can cover more than one product by the same vendor (for example, "juniper") and may require multiple source type definitions as a result. If the data source for which you are building a technology add-on has more than one data file with different formats (for example, "apache_error_log" and "apache_access_log"), you may choose to create multiple source types. These source types can be defined as part of the technology add-on in inputs.conf, in props.conf alone, or in both props.conf and transforms.conf. These files must be created as part of the technology add-on.
These files contain only the definitions for the source types that the technology add-on is designed to work with. There are three ways to specify source types:

1. Let Splunk automatically define source types in the data.
2. Define source types in transforms.conf.
3. Define source types in inputs.conf.

We recommend that you define your source types in inputs.conf. See "Configure rule-based source type recognition" in the core Splunk product
documentation for more information about source types.

Tip: In addition to the source type definition, you can add a statement that forces the source type when a file is uploaded with its extension set to the source type name. This allows you to import files for a given source type simply by setting the file extension appropriately (for example, "mylogfile.nessus"). Add this statement to the props.conf file in the Splunk_TA-<technology_add-on_name>/default directory, as in the following example:
[source:....nessus]
sourcetype = nessus
Change the name "nessus" to match your source type.

Note: Usually, the source type of the data is statically defined for the data input. However, you may be able to define a statement that recognizes the source type based on the content of the logs.

Confirm that your data has been captured

Once you have decided on a folder name for your technology add-on and defined the source type(s) in inputs.conf, in props.conf alone, or in both props.conf and transforms.conf, you can bring your technology add-on to life and start collecting your data within Splunk. Turn on the data source by enabling the stream or the scripted input to begin data collection. Confirm that you are receiving the data and that the source type you've defined is working appropriately by searching for that source type in your data.
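As a sketch of a static source type assignment in inputs.conf (the monitor path and source type name below are hypothetical, not from a shipped add-on):

```ini
# inputs.conf -- hypothetical example: assign a fixed source type to a monitored file
[monitor:///var/log/acme/firewall.log]
sourcetype = acme:firewall
disabled = false
```

Because the source type is assigned at the input, every event from this file is indexed with sourcetype=acme:firewall, and all of the add-on's search-time knowledge keyed to that source type applies automatically.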
Note: Restart Splunk in order for it to recognize the technology add-on and the source type defined in this step. After Splunk has been restarted, it automatically reloads changes made to search-time operations in props.conf and transforms.conf.
If the results of the search are different than expected, or no results are displayed, do the following:

1. Confirm that the data source has been configured to send data to Splunk. This can be validated by using Splunk to search for keywords and other information within the data.
2. If the data source is sent via scripted input, confirm that the scripted input is working correctly. This may be more challenging to confirm, but can be done by checking the data source, the script, or other points of failure.
3. If the data is indexed by Splunk but the source type is not defined as you expect, confirm that your source type logic is defined appropriately and retest.

Handling timestamp recognition

Splunk is designed to understand most common timestamps found in log data. Occasionally, however, Splunk may not recognize a timestamp when processing an event. If this situation arises, you must manually configure the timestamp logic in the technology add-on by adding the appropriate statements to the props.conf file. For specific information on manual timestamp recognition, see "How timestamp assignment works" in the core Splunk product documentation.

Configure line-breaking

Multi-line events can be another challenge when creating a technology add-on. Splunk handles these types of events by default, but on some occasions Splunk will not recognize events as multi-line. If the data source is multi-line and Splunk has issues recognizing it, see the section "Configure event linebreaking" in the core Splunk product documentation, which provides additional guidance on how to configure Splunk for multi-line data sources. Use this information in a technology add-on by making the necessary modifications to the props.conf file.
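As an illustrative sketch of both adjustments in props.conf (the source type name and timestamp format are hypothetical and must match your actual data):

```ini
# props.conf -- hypothetical example for a syslog-style, multi-line source
[acme:firewall]
# Timestamp handling: e.g. "Jan 12 15:02:08" at the start of each event
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 16
# Line breaking: merge lines, starting a new event only at a new timestamp
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}
```

Both groups of settings take effect at index time, so restart Splunk and re-index a sample before verifying.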
Understand your data and PCI compliance dashboards

To analyze the data, a data sample needs to be available. If the data is loaded into Splunk, view the data based on the source type defined in Step 1. Determine the dashboards into which this data might fit. See the "Reports" topic in the Splunk App for PCI Compliance Installation and Configuration Manual to understand what each of the reports and panels within the Splunk App for PCI Compliance requires and the types of events that are applicable. With this information, review the data to determine which events need to be normalized for use within the app.

Note: The reports in the Splunk App for PCI Compliance are designed to use information that typically appears in the related data sources. The data source for which the technology add-on is being built may not provide all the information necessary to populate the required dashboards. If this is the case, look further into the technology that drives the data source to determine whether other data is available or accessible to fill the gap.

Reviewing the data source and identifying relevant events should be relatively easy. This document assumes familiarity with the data and the ability to determine what types of events the data contains. After events and their contents have been reviewed, the next step is to match the data to the application reports and dashboards. Review the dashboards directly in the Splunk App for PCI Compliance, or check the descriptions in the Splunk App for PCI Compliance User Guide, to see where the events might fit into these reports. The dashboards in the Splunk App for PCI Compliance are grouped into Scorecards, Reports, and Audit:

Scorecards: Provide at-a-glance summary information about the current PCI compliance environment by requirement area. Scorecards present real-time views of the environment. At a glance, you can determine your PCI compliance status in each of the requirement areas.
Reports: Provide a historical view of activity related to each of the requirement areas. With reports you can track your PCI compliance, by requirement, over time. Audit: The Audit dashboards provide a record of changes made in the PCI compliance environment to notable events, suppressions, forwarders, access to reports and scorecards, and consistency and security of data.
Within these areas, find the specific dashboard(s) that relate to your data. In some cases, your data source may include events that belong in different dashboards, or even in different areas.

Example: Assume that you have decided to capture logs from one of your firewalls. You would expect the data produced by the firewall to contain network traffic-related events. However, in reviewing the data captured by Splunk, you may find that it also contains authentication events (login and logoff) and device change events (policy changes, addition/deletion of accounts, etc.). Knowing that these events exist will help determine which of the dashboards within the Splunk App for PCI Compliance are likely to be applicable. In this case, the Network Traffic Activity Report, Firewall Rule Activity Report, and Requirement 1 Scorecard dashboards are designed to report on the events found in this firewall data source.

Taking the time in advance to review the data and the application dashboard functionality will make the next steps of defining the Splunk fields and tags easier. The product and vendor fields need to be defined. These fields are not included in the data itself, so they will be populated using a lookup.

Static strings and event fields

Do not assign static strings to event fields, because this prevents the field values from being searchable. Instead, fields that do not exist in the data should be mapped with a lookup. For example, here is a log message:
Jan 12 15:02:08 HOST0170 sshd[25089]: [ID 800047 auth.info] Failed publickey for privateuser from 10.11.36.5 port 50244 ssh2
To extract a field "action" from the log message and assign a value of "failure" to that field, either a static field (non-searchable) or a lookup (searchable) could be used. For example, to assign a static string, this information would be added to props.conf and transforms.conf:
## props.conf
[linux_secure]
REPORT-action_for_sshd_failed_publickey = action_for_sshd_failed_publickey

## transforms.conf
[action_for_sshd_failed_publickey]
REGEX = Failed\s+publickey\s+for
FORMAT = action::"failure"
## note the static assignment of action=failure above
This approach is not recommended; searching for "action=failure" in Splunk would not return these events, because the text "failure" does not exist in the original log message. The recommended approach is to extract the actual text from the message and map it using a lookup:
## props.conf
[linux_secure]
LOOKUP-action_for_sshd_failed_publickey = sshd_action_lookup vendor_action OUTPUTNEW action
REPORT-vendor_action_for_sshd_failed_publickey = vendor_action_for_sshd_failed_publickey

## transforms.conf
[sshd_action_lookup]
filename = sshd_actions.csv

[vendor_action_for_sshd_failed_publickey]
REGEX = (Failed\s+publickey)
FORMAT = vendor_action::$1

## sshd_actions.csv
vendor_action,action
"Failed publickey",failure
By mapping the extracted field through a lookup, a search for "action=failure" now finds these events, even though the text "failure" does not appear in the raw log message.
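With the lookup in place, the normalized field can be used directly in searches. An illustrative search (not taken from the manual):

```spl
sourcetype=linux_secure action=failure
| stats count BY vendor_action, action
```

The stats table makes it easy to confirm that each vendor-specific value (here, "Failed publickey") is being translated to the expected CIM value.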
Use the following process to work through the creation of each required field, for each class of event that you have identified:
Note: This process can be done manually by editing configuration files, or graphically by using the Interactive Field Extractor. See the "Examples" sections in this document for more detail on these options.

Choose event

In the Splunk Search app, start by selecting a single event, or several almost identical events, to work with. Start the Interactive Field Extractor and use it to identify fields for this event. Verify your conclusions by checking that similar events contain the same fields.

Identify fields

Each event contains relevant information that you will need to map into the Common Information Model. Start by identifying the fields of information that are present within the event. Then check the Common Information Model to see which fields are required by the dashboards where you want to use the event.

Note: See "Create reports" in the Splunk App for PCI Compliance Installation and Configuration Manual, which lists the fields required by each report (or dashboard). Additional information about the Common Information Model can be found in the "Common Information Model" topic in the Splunk documentation.

Where possible, determine how the information within the event maps to the fields required by the Common Information Model (CIM). Some events will have fields that are not used by the dashboards; it may still be necessary to create extractions for these fields. On the other hand, certain fields may be missing or have values other than those required by the CIM. Fields can be added, modified, or marked as unknown in order to fulfill the dashboard requirements.

Note: In some cases, it may turn out that the events simply do not have enough information (or the right type of information) to be useful in the Splunk App for
PCI Compliance, because the type of data being brought in is not applicable to security (for example, weather information).

Create field extractions

After the relevant fields have been identified, create the Splunk field extractions that parse and/or normalize the information within the event. Splunk provides numerous ways to populate search-time field information; the specific technique depends on the data source and the information available within that source. For each field required by the CIM, create a field property that does one or more of the following:

Parse the event and create the relevant field (field extraction). Example: In a custom application, "Error" at the start of a line means an authentication error, so it can be extracted to an authentication field and tagged with "action=failed".

Rename an existing field so that it matches (field aliasing). Example: The "key=value" extraction has produced "target=10.10.10.1", and "target" needs to be aliased to "dest".

Convert an existing field to a value that matches what the Common Information Model expects (normalizing the field value). Example: The "key=value" extraction has produced "action=stopped", and "stopped" needs to be changed to "blocked".

Extract fields

Review each field required by the Common Information Model and find the corresponding portion of the log message. Use a field extraction statement to extract the field. See "field extractions" for more information on creating field extractions.

Note: Splunk may auto-extract certain fields at search time if they appear as "field"="value" in the data. Typically, the names of the auto-extracted fields will differ from the names required by the CIM. This can be fixed by creating field aliases for those fields.
Create field aliases

It may be necessary to rename an existing field to populate another field. For example, the event data may include the source IP address in the field "src_ip", while the Common Information Model requires the source to be placed in the "src" field. The solution is to define a field alias that creates a new field, "src", containing the value from "src_ip", so that the field name corresponds to the name defined in the Common Information Model. See "field aliases" for more information about creating field aliases.

Normalize field values

Make sure that the value populated by the field extraction matches the field value requirements in the Common Information Model. If the value does not match (for example, the Common Information Model requires "success" or "failure" but the log message uses "succeeded" and "failed"), create a lookup to translate the field so that it matches the value defined in the Common Information Model. See "setting up lookup fields" and "tag and alias field values" to learn more about normalizing field values.

Verify field extractions

Once field extractions have been created for each of the security-relevant events to be processed, validate that the fields are in fact extracted properly. To confirm this, search for the source type.
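A minimal sketch of the three techniques together in props.conf (the source type, field names, regex, and lookup name are hypothetical, not taken from a shipped add-on):

```ini
# props.conf -- hypothetical extraction, alias, and lookup-based normalization
[acme:firewall]
# 1. Field extraction: pull dest_port out of "dpt=<number>"
EXTRACT-acme_dpt = dpt=(?<dest_port>\d+)
# 2. Field alias: the CIM expects "src", the data provides "src_ip"
FIELDALIAS-acme_src = src_ip AS src
# 3. Lookup: translate a vendor-specific action value to the CIM value
LOOKUP-acme_action = acme_action_lookup vendor_action OUTPUTNEW action
```

All three settings are search-time operations, so they can be added and adjusted without re-indexing the data.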
Perform a search
Run a search for the source type defined in the technology add-on to verify that each of the expected fields of information is defined and available in the field picker. If a field is missing or displays the wrong information, go back through these steps to troubleshoot the technology add-on and determine what is going wrong.

Example: A technology add-on for Netscreen firewall logs is created: network-communication events are identified, a source type of "netscreen:firewall" is created, and the required field extractions are defined. To validate that the field extractions are correct, run the following search:
sourcetype="netscreen:firewall"
These events are network-traffic events. The "Create reports" topic in the Splunk App for PCI Compliance Installation and Configuration Manual shows that this type of data requires the following fields: action, dvc, transport, src, dest, src_port, and dest_port. Use the field picker to display these fields at the bottom of the events, and then scan the events to see that each of these fields is populated correctly. If a field has an incorrect value, change the field extraction to correct it. If a field is empty, investigate whether the field is actually empty in the data or should be populated.
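One quick way to check the coverage of these required fields (an illustrative search, not from the manual) is to table them directly:

```spl
sourcetype="netscreen:firewall"
| table _time, action, dvc, transport, src, dest, src_port, dest_port
```

Empty columns in the resulting table point immediately to the extractions that still need work.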
Creating an event type actually creates a new field, which can be tagged according to the Common Information Model. Once the event type is created, you then create the tags (for example, "authentication", "network", "communicate", and so on) that are used to group events into categories. To create the necessary tags, edit the tags.conf file in the default directory and enable the necessary tags on the event type field.

Verify the tags

To verify that the data is being tagged correctly, display the event type tags and review the events. To do this, search for the source type you created and use the field discovery tool to display the field "tag::eventtype" at the bottom of each event. Then look at your events to verify that they are tagged correctly. If you created more than one event type, also make sure that each event type is finding the events you intended.
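As a sketch of the event type and its tags (the event type name and search string below are hypothetical):

```ini
# eventtypes.conf -- define an event type for the firewall traffic events
[acme_firewall_communicate]
search = sourcetype="acme:firewall" action=*

# tags.conf -- tag the event type per the Common Information Model
[eventtype=acme_firewall_communicate]
network = enabled
communicate = enabled
```

The stanza name in tags.conf refers to the eventtype field value, so the app's searches can select these events with tag=network tag=communicate regardless of vendor.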
Check PCI compliance dashboards Once the field extractions and tags have been created and verified, the data should begin to appear in the corresponding dashboard(s). Open each dashboard you wanted to populate and verify that the dashboard information displays properly. If it does not, check the fields and tags you created to identify and correct the problem. Note: Many of the searches within the Splunk App for PCI Compliance run on a periodic scheduled basis. You may have to wait a few minutes for the scheduled search to run before data will be available within the respective dashboard. Review the Search View Matrix in the Splunk App for PCI Compliance User Manual to see which searches need to run. Navigate to Manager > Searches to see if those searches are scheduled to run soon.
Note: Searches cannot be run directly from the PCI compliance interface due to a known issue in the Splunk core permissions.
Scripted input setup: How to set up the scripted input(s) (if applicable)
Package the technology add-on

Next, prepare the technology add-on so that it can be deployed easily. In particular, ensure that modifications or upgrades will not overwrite files that need to be modified or created locally.

First, make sure that the archive does not include any files under the add-on's local directory. The local directory is reserved for files that are specific to an individual installation or to the system where the technology add-on is installed.

Next, add a .default extension to any files that may need to be changed on individual instances of Splunk running the technology add-on. This includes dynamically generated files (such as lookup files generated by saved searches) as well as lookup files that users must configure on a per-install basis. If you include a lookup file in the archive and do not add a .default extension, an upgrade will overwrite the corresponding file. Adding the .default extension makes it clear to the administrator that the file is a default version, to be used only if the file does not already exist.

Finally, compress the technology add-on into a single file archive (such as a zip or tar.gz archive). To share the technology add-on, go to Splunkbase, click "upload an app", and follow the instructions for the upload.
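The packaging steps above can be sketched with standard shell commands (the add-on name "TA-acme" and the lookup file are hypothetical; a real add-on would already exist on disk rather than being created here):

```shell
# Hypothetical add-on "TA-acme": prepare and package it for distribution.
# Create a minimal layout for the sketch.
mkdir -p TA-acme/default TA-acme/lookups TA-acme/local
printf 'vendor_action,action\n' > TA-acme/lookups/acme_actions.csv

# Mark the per-install lookup as a default so an upgrade cannot overwrite
# a locally configured copy.
mv TA-acme/lookups/acme_actions.csv TA-acme/lookups/acme_actions.csv.default

# local/ is reserved for per-system files; exclude it from the archive.
tar --exclude='TA-acme/local' -czf TA-acme.tar.gz TA-acme
```

At install time, the administrator copies acme_actions.csv.default to acme_actions.csv only if no local copy already exists.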
Source types
Out-of-the-box source types
This section lists the data sources for which Splunk for PCI Compliance provides out-of-the-box support, along with the source types used for the different data sources and technology add-ons. Source types are important because PCI Compliance uses source types as the basis of its understanding of all data coming in from a particular source. Source types need to be carefully defined so that they are not overloaded or misused.

When a supported data type is imported, the correct source type needs to be assigned to the data to ensure that the data is recognized and parsed correctly by PCI Compliance. For example, events from a Juniper firewall must be assigned a netscreen:firewall source type for TA-juniper to recognize and parse them correctly.

To learn more about the supported data types and source types, see the "List of pretrained source types" in the Splunk documentation. For more information on assigning source types to data inputs, see "About default fields" in the Splunk documentation.

The following table lists the data sources with out-of-the-box support in the Splunk App for PCI Compliance, along with the associated source type(s) and technology add-on:

Data Source | Source type(s) | Technology add-on

Wireless Devices
Motorola AirDefense wireless IDS | airdefense | TA-airdefense

Proxies
Blue Coat ProxySG | bluecoat | TA-bluecoat
Squid | squid | TA-squid

Firewalls
Juniper NetScreen firewalls and IDP intrusion detection/prevention systems | juniper:idp, netscreen:firewall, juniper:nsm:idp, juniper:nsm | TA-juniper
Fortinet Unified Threat Management (UTM) systems | fortinet | TA-fortinet
Palo Alto firewalls | pan, pan:config, pan:system, pan:threat, pan:traffic | TA-paloalto
Checkpoint firewalls | checkpoint | TA-checkpoint

Intrusion Detection/Prevention
Snort | snort | TA-snort
McAfee | mcafee:ids | TA-mcafee
Symantec detection/prevention system and Symantec AntiVirus version 11 and later | sep, sep:scm_admin | TA-sep
Symantec AntiVirus | sav, winsav | TA-sav

WMI
WMI | WMI:LocalApplication, WMI:LocalSystem, WMI:LocalSecurity, WMI:CPUTime, WMI:FreeDiskSpace, WMI:LocalPhysicalDisk, WMI:Memory, WMI:LocalNetwork, WMI:LocalProcesses, WMI:ScheduledJobs, WMI:Service, WMI:InstalledUpdates, WMI:Uptime, WMI:UserAccounts, WMI:UserAccountsSID, WMI:Version | Splunk_TA_windows

Networking Devices
Alcatel network switches | alcatel | TA-alcatel
Common Event Format (CEF) | cef | TA-cef
flowd NetFlow collector | flowd | TA-flowd
FTP (File Transfer Protocol) servers | vsftpd | TA-ftp

Vulnerability Management Systems
nCircle IP360 vulnerability management system | ncircle:ip360 |
Nessus vulnerability scanner | nessus |
Nmap security scanner | nmap |

Operating Systems
Snare | snare | TA-windows
NTSyslog | ntsyslog | TA-windows
Monitorware | monitorware | TA-windows
Platform-specific Unix authentication (security) logs | dhcpd, linux_secure, aix_secure, osx_secure, syslog | TA-nix
Microsoft Windows | DhcpSrvLog, WindowsUpdateLog, WinRegistry, WinEventLog:Security, WinEventLog:Application, WinEventLog:System, fs_notification, scripts:InstalledApps, scripts:ListeningPorts | TA-windows, TA-deployment-apps

Other
IP2Location geolocation software | | TA-ip2location
Oracle database | oracle | TA-oracle
RSA | source::WinEventLog:Application, WinEventLog:Application:rsa | TA-rsa
Splunk access and authentication logs | audittrail | TA-splunk
Perfmon | PERFMON:CPUTime, PERFMON:FreeDiskSpace, PERFMON:Memory, PERFMON:LocalNetwork | TA-windows
Splunk for Cisco IPS
http://splunk-base.splunk.com/apps/22292/splunk-for-cisco-ips

Splunk for Cisco Firewalls
http://splunk-base.splunk.com/apps/22303/splunk-for-cisco-firewalls

Splunk for Cisco Client Security Agent
http://splunk-base.splunk.com/apps/22304/splunk-for-cisco-client-security-agent

Splunk for Cisco IronPort Email Security Appliance
http://splunk-base.splunk.com/apps/22305/splunk-for-cisco-ironport-email-security-app

Splunk for Cisco IronPort Web Security Appliance
http://splunk-base.splunk.com/apps/22302/splunk-for-cisco-ironport-web-security-appli

Splunk for Cisco MARS
http://splunk-base.splunk.com/apps/22306/splunk-for-cisco-mars
These apps can be installed on the search head alongside the Splunk App for PCI Compliance and then partially disabled to reduce load. To disable the Cisco searches, go to Manager > Searches and Reports, select the app name, and disable all searches. To disable their dashboards, go to Manager > User Interface > Views, select the app name, and disable all views.
Examples
Generic example
A technology add-on maps data and extracts fields from a data feed for use with Splunk. The data needs to be expressed in security-relevant terms to be visible within the Splunk App for PCI Compliance. This example shows how to create a technology add-on to map a data feed for use with Splunk for PCI Compliance. Mapping the data feed informs Splunk that the data in this source type needs these extractions to provide security context. To create this knowledge, sample data for the new feed is input and source typed, and field extractions, tags, and actions are created.

Before creating the technology add-on, you should be familiar with Splunk for PCI Compliance and with the data that you will be mapping (location, elements). Identify which portions of Splunk for PCI Compliance the data will populate (views, searches, dashboards). For more information about the tags and fields needed for different views and dashboards, see the Search View Matrix in the Splunk App for PCI Compliance User Manual.

The process for creating the add-on is:

1. Get the data into Splunk.
2. Set the source type.
3. Create field extractions.
4. Create event types (if necessary).
5. Create tags (if necessary).
6. Prepare and populate the folder.
7. Test and package the add-on.
8. Upload to Splunkbase.
1. In Splunk, go to Manager > Data Inputs. Click Add New and upload a sample file containing a representative number of records, or connect to the streaming data source. Be sure to set the source type manually and select a name that represents the data.

The source type name is an important value that enables Splunk to recognize that a data feed can be mapped with the knowledge in a given technology add-on. Source typing tells Splunk what sort of extractions have been mapped for this data. The source type of data is set at index time and cannot be changed after ingestion. For introductory information about source typing, see "Why sourcetypes matter (a lot)" in the core Splunk product documentation and Michael Wilde's article "Sourcetypes gone wild" on the Splunk blogs.

Splunk cannot index data without setting a source type, so automatic source types are set for data feeds during indexing. These automatic source type decisions are permanent for a data input. For instance, if sample.moof is specified as a data input file and the source type is set to moof, all future updates from that file are set to the same source type. See "Override automatic source type assignment" in the core Splunk product documentation to learn more about this process. The list of source types that have already been mapped by Splunk ("List of pre-trained source types") can be found in the core Splunk product documentation.

Adding a new data input through the user interface will establish a local/inputs.conf file under the current application context (which is likely to be launcher or search). For instance, if the Welcome > Add data button were used, the inputs configuration file would be found at $SPLUNK_HOME/etc/apps/launcher/local/inputs.conf.

Important: For a production technology add-on, the sample data to be mapped should be fed into Splunk in the same manner as live data, to reduce differences in format and configuration.
For example, if it will be arriving as a file, upload the sample data as a file; if it will be coming from TCP, UDP, or a script, then the sample data should be brought into Splunk by that method. Note: Processing complex data feeds is beyond the scope of this chapter. For more information about establishing a data feed to map with this process, see "What Splunk can monitor" in the core Splunk product documentation.
Search on the new source type to verify that the data is being indexed correctly:
sourcetype="moof"
1. In the data sample, click the context menu which appears to the left of each data row and choose Extract Fields to launch the Interactive Field Extractor (IFX). The field Restrict field extraction to will contain the proper source type by default, and should not be altered. Please review the "Interactive Field Extractor manual" and "Manage search-time field extractions" before proceeding. 2. Refer to the fields needed for the views, dashboards, and searches in the Splunk App for PCI Compliance in Search View Matrix in the Splunk App for PCI Compliance User Manual. List the fields needed for this new technology add-on. For example, if moof is a type of firewall, its logs will be useful in Reports > Network Traffic Activity, which will lead to use of these fields:
action
dvc
transport
src
dest
src_port
dest_port
For each field required, use the IFX to construct a field extraction.

1. Paste several sample values from the data into the "Example values for a field" field (separating multiple values with commas) and click Generate. A generated pattern (regular expression, or regex) appears beneath the example value field, and extracted results appear to the left of the sample data. Extracted results are highlighted with an "X" next to them. Click the "X" to delete unwanted matches from the search and refine the regular expression. Note: The data source may require parsing that is beyond the capabilities of IFX, in which case the data and the partially correct regular expression can be copied into an external tool for further editing.

2. When the completed regular expression correctly matches the samples for this field, click Save and enter the name of the field.

Repeat this process for each of the fields needed to use the new data. These extractions are saved in $SPLUNK_HOME/etc/users/$USERNAME$/$APPNAME$/local/props.conf. For instance, if this work is done as admin in the Search app, the path will be $SPLUNK_HOME/etc/users/admin/search/local/props.conf.
2. Select the proper event, and click the context menu and choose Create eventtype from the drop-down menu. In the Build Event Type editor, use the source type as the event type. For instance, if moof logs authentication_de_la_moof=true when it authenticates a connection, this event should be used. 3. Name the event type something descriptive (such as moof_authentication) and Save the event type. The eventtype is stored in
$SPLUNK_HOME/etc/users/$USERNAME$/$APPNAME$/local/eventtypes.conf. For instance, if this work is done as admin in the Search app, the path will be $SPLUNK_HOME/etc/users/admin/search/local/eventtypes.conf.
For example, tag the moof_authentication event type with authentication. Using event types and tags allows unified searches across multiple platforms with similar purposes, such as:
tag::authentication="true"
To determine which tags are needed in the technology add-on, refer to the Search View Matrix in the Splunk App for PCI Compliance User Manual and make a list of the needed tags. To create the tags:

1. If we are looking for firewall traffic, only two tags are needed: network and communicate.
2. View the event type using the Search app:
sourcetype="moof" eventtype="moof_authentication"
The new event type will be displayed and highlighted beneath the event; click the context menu to the right and select "tag eventtype=moof_authentication". 3. Enter the name of the tag and click Save. Repeat this process for each of the tags needed to use the new data. These tag modifications are also saved in $SPLUNK_HOME/etc/users/$USERNAME$/$APPNAME$/local/eventtypes.conf. For instance, if this work is done as admin in the Search app, the path will be
$SPLUNK_HOME/etc/users/admin/search/local/eventtypes.conf
2. Rename the extracted folder $SPLUNK_HOME/etc/apps/TA-template to a name that reflects its new purpose, such as $SPLUNK_HOME/etc/apps/Splunk_TA-moof.
mv TA-template Splunk_TA-moof
cd Splunk_TA-moof
This will be the folder for the new technology add-on.

Edit inputs.conf if necessary

If this technology add-on will be responsible for feeding the data in, edit the default/inputs.conf file and specify the input mechanism, as well as setting a sourcetype. For instance, this method would be used if moof logs were in binary format and needed to be translated with an external script before use in Splunk. If the data will be fed into Splunk directly (e.g. through file or data stream input), editing inputs.conf is not necessary. Review the $SPLUNK_HOME/etc/apps/$APPNAME$/local/inputs.conf file produced in [Get the sample data into Splunk] for example configuration.

Edit props.conf to set the sourcetype name

The source type is specified in the props.conf file. Multiple source type rules can be set in this file to support different types of data feed. In this case, a simple file extension rule is used to set the source type to moof. To specify the feed and source type of our sample data, edit the default/props.conf file. List the source of the data in brackets, then set its sourcetype:
[source::/path/to/sample.moof]
sourcetype = moof
Reference material for props.conf may be found in the Splunk documentation. Review the $SPLUNK_HOME/etc/apps/$APPNAME$/local/inputs.conf file produced in [Get the sample data into Splunk] for example configuration; setting the source type can be performed in inputs.conf or props.conf.

Edit transforms.conf and props.conf to prepare field extractions

The field extractions produced with IFX are saved in
$SPLUNK_HOME/etc/users/$USERNAME$/$APPNAME$/local/props.conf.
For instance, if this work was done as admin in the Search app, the path will be $SPLUNK_HOME/etc/users/admin/search/local/props.conf. Each extraction is saved in the following format:
EXTRACT-$FIELD_NAME$ = $REGULAR_EXPRESSION$
This is strictly functional, but to provide a higher level of flexibility and maintainability, the technology add-on should split this form into a transforms.conf statement.

1. Copy each regular expression into default/transforms.conf in the following format:

[get_$FIELD_NAME$]
REGEX = $REGULAR_EXPRESSION$
FORMAT = $FIELD_NAME$::$1
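The transform is only half of the split; the source type stanza in default/props.conf must also reference it with a REPORT line so the extraction runs at search time. A minimal sketch, assuming the moof source type and illustrative transform names matching the firewall fields listed earlier:

```
[moof]
REPORT-moof_fields = get_action, get_dvc, get_src, get_dest
```

Transforms listed on one REPORT line are applied together whenever events of this source type are searched.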
Save both files. Now the technology add-on is prepared to source type the data and extract the proper fields from it. This is sufficient for some basic correlation searches, but to fully utilize the data source, event types and tags should be used as well.

Edit eventtypes.conf and tags.conf to prepare tags

The event types produced in the web console are saved in
$SPLUNK_HOME/etc/users/$USERNAME$/$APPNAME$/local/eventtypes.conf. For instance, if this work is done as admin in the Search app, the path will be $SPLUNK_HOME/etc/users/admin/search/local/eventtypes.conf.
1. Copy these event types into default/eventtypes.conf. 2. Create a new file default/tags.conf and enable tags for each event type:
[eventtype=moof_authentication]
network = enabled
communicate = enabled
Verify that the add-on is working:

a. Source typing is working correctly - Go to the Summary screen; the source type is listed under Sources.
b. Event types are showing up - Click the source type and scroll down in the Field Discovery panel to "eventtype". The event type is listed.
c. Tags are displayed - Click the event type and scroll down to tag::eventtype; the new tags are listed.

Next, go into the Splunk App for PCI Compliance and check that the dashboards are populating correctly. Go to the dashboard you expect to be populated and check that the data is being displayed. Review the technology add-on for complete coverage of the fields, event types, and tags required for the use cases and data sources it needs to support.
Example 1: Blue Coat Proxy Logs
Define a source type for the data

This add-on handles only one type of log, so we need a single source type, which we name "bluecoat". We define the source type in default/props.conf of Splunk_TA-bluecoat:
[source::....bluecoat]
sourcetype = bluecoat

[bluecoat]
Handle timestamp recognition

Looking at the events, we see that Splunk successfully parses the date and time, so there is no need to customize the timestamp recognition.

Configure line breaking

Each log message is separated by an end-line; therefore, we need to disable line merging to prevent multiple messages from being combined. Line merging is disabled by setting SHOULD_LINEMERGE to false in props.conf:
[source::....bluecoat]
sourcetype = bluecoat

[bluecoat]
SHOULD_LINEMERGE = false
Review the Common Information Model and the Search View Matrix in the Splunk App for PCI Compliance User Manual and determine that the Blue Coat technology add-on needs to define the following fields to work with the Proxy Center and Proxy Search (all of them in the Network Protection domain, Proxy sub-domain):

Field Name         Data Type
action             string
status             int
src                variable
dest               variable
http_content_type  string
http_referer       string
http_user_agent    string
http_method        string
user               string
url                string
vendor             string
product            string
Create extractions

Blue Coat data consists of fields separated by spaces. To parse this data we can use automatic key/value pair extraction and define the names of the fields by their position. Start by analyzing the data and identifying the available fields. We see that the data contains many duplicate or similar fields that describe activity between different devices. For clarity, create the following temporary naming convention to help characterize the fields:

Prefix  Description
s       Relates to the proxy server
c       Relates to the client requesting access through the proxy server
cs      Relates to the activity between the client and the proxy server
sc      Relates to the activity between the proxy server and the client
rs      Relates to the activity between the remote server and the proxy server
Identify the existing fields and give them temporary names, listed here in the order in which they occur:

1. date
2. time
3. c-ip
4. sc-bytes
5. time-taken
6. s-action
7. sc-status
8. rs-status
9. rs-bytes
10. cs-bytes
11. cs-auth-type
12. cs-username
13. sc-filter-result
14. cs-method
15. cs-host
16. cs-version
17. sr-bytes
18. cs-uri
19. cs(Referer)
20. rs(Content-Type)
21. cs(User-Agent)
22. cs(Cookie)
Once we know what the fields are and what they contain, we can map the relevant fields to the fields required in the Common Information Model (CIM):

Blue Coat Field   Field in CIM
date              date
time              time
c-ip              src
sc-bytes          bytes_in
time-taken        duration
s-action          action
sc-status         status
rs-status         N/A
rs-bytes          N/A
cs-bytes          bytes_out
cs-auth-type      N/A
cs-username       user
sc-filter-result  N/A
cs-method         http_method
cs-host           dest
cs-version        app_version
sr-bytes          sr_bytes
cs-uri            url
cs(Referer)       http_referer
Next create the field extractions. Since the Blue Coat data is space-delimited, we start by setting the delimiter to the single space character. Then define the order of the field names. Add the following to default/transforms.conf in the Splunk_TA-bluecoat folder.
[auto_kv_for_bluecoat]
DELIMS = " "
FIELDS = "date","time","c-ip","sc-bytes","time-taken","s-action","sc-status","rs-status","rs-bytes","cs-bytes","cs-auth-type","cs-username","sc-filter-result","cs-method","cs-host","cs-version","sr-bytes","cs-uri","cs(Referer)","rs(Content-Type)","cs(User-Agent)","cs(Cookie)"
36
Where:

DELIMS: defines the delimiter between fields (in this case, a space)
FIELDS: defines the fields in the order in which they occur

Add a reference to the name of the transform to default/props.conf in the Splunk_TA-bluecoat folder to enable it:
[bluecoat]
SHOULD_LINEMERGE = false
REPORT-0auto_kv_for_bluecoat = auto_kv_for_bluecoat
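Conceptually, a DELIMS-based extraction pairs each field name with the value at the same position of the split event. A rough Python sketch of that behavior, using the temporary field names from the list above and a fabricated (hypothetical) Blue Coat log line:

```python
# Temporary field names in the order they occur, as listed above.
FIELDS = [
    "date", "time", "c-ip", "sc-bytes", "time-taken", "s-action",
    "sc-status", "rs-status", "rs-bytes", "cs-bytes", "cs-auth-type",
    "cs-username", "sc-filter-result", "cs-method", "cs-host",
    "cs-version", "sr-bytes", "cs-uri", "cs(Referer)", "rs(Content-Type)",
    "cs(User-Agent)", "cs(Cookie)",
]

# A fabricated, space-delimited log line with one value per field (22 values).
sample = ("2013-05-11 16:00:01 10.1.1.5 1024 120 TCP_HIT 200 200 512 256 "
          "Basic jdoe OBSERVED GET example.com 1.1 512 /index.html - "
          "text/html Mozilla/5.0 -")

# Pair each name with the value at the same position, as DELIMS/FIELDS does.
event = dict(zip(FIELDS, sample.split(" ")))
print(event["c-ip"], event["cs-method"], event["cs-host"])
```

Real proxy logs can contain quoted values with embedded spaces; this sketch only illustrates the positional pairing, not Splunk's full delimiter handling.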
Now define the product and vendor. Since these fields are not included in the data, they will be populated using a lookup.
## props.conf
REPORT-vendor_for_bluecoat = vendor_static_bluecoat
REPORT-product_for_bluecoat = product_static_proxy

## transforms.conf
[vendor_static_bluecoat]
REGEX = (.)
FORMAT = vendor::"Bluecoat"

[product_static_proxy]
REGEX = (.)
FORMAT = product::"Proxy"

## bluecoat_vendor_info.csv
sourcetype,vendor,product
bluecoat,Bluecoat,Proxy
Note: Do not under any circumstances assign static strings to event fields as this will prevent these fields from being searched in Splunk. Instead, fields that do not exist should be mapped with a lookup. See "Static strings and event fields" in this document for more information. Now we enable the transforms in default/props.conf using REPORT lines to call out the transforms.conf sections that will enable proper field extractions:
37
Generally, it is a good idea to add a field extraction that will process data that is uploaded as a file into Splunk directly. To do this, add a property to default/props.conf that indicates that any files ending with the extension bluecoat should be processed as Blue Coat data:
[source::....bluecoat]
sourcetype = bluecoat

[bluecoat]
SHOULD_LINEMERGE = false
REPORT-0auto_kv_for_bluecoat = auto_kv_for_bluecoat
REPORT-product_for_bluecoat = product_static_proxy
REPORT-vendor_for_bluecoat = vendor_static_bluecoat
Verify field extractions

Now that the field extractions have been created, we need to verify that they are working correctly. First, restart Splunk so that the changes take effect. Next, search for the source type in the Search dashboard:
sourcetype="bluecoat"
When the results are displayed, select Pick Fields and choose the fields that ought to be populated. Once the fields are selected, click a field name to display a list of the values that appear in that field.
Identify necessary tags

The Common Information Model requires that proxy data be tagged with "web" and "proxy":

Domain              Sub-Domain  Macro  Tags
Network Protection  Proxy       proxy  web, proxy
Create tag properties Now that we know what tags to create, we can create the tags. First, we need to create an event type to which we can assign the tags. To do so, we create a stanza in default/eventtypes.conf that assigns an event type of bluecoat to data with the source type of bluecoat:
[bluecoat]
search = sourcetype=bluecoat
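With the event type in place, the web and proxy tags themselves can be enabled in default/tags.conf, following the same pattern the generic example used; a sketch:

```
[eventtype=bluecoat]
web = enabled
proxy = enabled
```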
Verify the tags

Now that the tags have been created, we can verify that they are being applied. Search for the source type in the search view:
sourcetype="bluecoat"
Use the field picker to display the field "tag::eventtype" at the bottom of each event. Review the entries and look for the tag statements at the bottom of the log message.

Check PCI compliance dashboards

Now that the tags and field extractions are complete, the data should show up in the Splunk App for PCI Compliance. The extracted fields and the defined tags fit into the Network Traffic Activity dashboard; therefore, the bluecoat data ought to be visible there. However, the bluecoat data will not be immediately available, since the Splunk App for PCI Compliance uses summary indexing. It may take up to an hour after Splunk has been restarted for the data to appear. After an hour or so, the dashboard begins populating with Blue Coat data.
Package the technology add-on

Package up the Blue Coat technology add-on by converting it into a zip archive named Splunk_TA-bluecoat.zip. To share the technology add-on, go to Splunkbase, click upload an app, and follow the instructions for the upload.
Example 2: OSSEC
This example shows how to create a technology add-on for OSSEC, an open-source host-based intrusion detection system (IDS). This example is somewhat more complex than the Blue Coat example and shows how to perform the following additional tasks:

- Use regular expressions to extract the necessary fields.
- Convert the values in the severity field to match the format required in the Common Information Model.
- Create multiple event types to identify different types of events within a single data source.
Define a source type for the data

For this technology add-on, use the source type ossec to identify data associated with the OSSEC intrusion detection system.

Confirm that the data has been captured

After the source type is defined for the technology add-on, set the source type for the data input.

Handle timestamp recognition

Splunk successfully parses the date and time, so there is no need to customize the timestamp recognition.

Configure line breaking

Each log message is separated by an end-line; therefore, line merging needs to be disabled to prevent multiple messages from being combined. Line merging is disabled by setting SHOULD_LINEMERGE to false in default/props.conf:
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec
All of the fields required for this add-on fall under the Intrusion Detection sub-domain of the Common Information Model.
Create extractions OSSEC data is in a proprietary format that does not use key-value pairs or any kind of standard delimiter between the fields. Therefore, we will have to write a regular expression to parse out the individual fields. Below is an outline of a log message with the relevant fields highlighted:
We see that the severity field includes an integer, while the Common Information Model requires a string. Therefore, we will extract this into a different field, severity_id, and perform the necessary conversion later to produce the severity field.
Extracting the Location, Message, severity_id, signature, and src_ip fields

Now, edit default/transforms.conf and add stanzas that extract the fields we need:
[force_sourcetype_for_ossec]
DEST_KEY = MetaData:Sourcetype
REGEX = ossec\:
FORMAT = sourcetype::ossec

[kv_for_ossec]
REGEX = Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;\s+Location\:\s+([^;]+)\;\s*(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*(user\:\s+([^;]+)\;){0,1}\s*(.*)
FORMAT = severity_id::"$1" signature_id::"$2" signature::"$3" Location::"$4" src_ip::"$6" user::"$8" Message::"$9"
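Before restarting Splunk, it can be useful to sanity-check the kv_for_ossec regular expression outside of Splunk. The following Python sketch runs the pattern, exactly as written above, over a fabricated sample alert (the log line is hypothetical, shaped to exercise every capture group):

```python
import re

# The kv_for_ossec pattern from transforms.conf, verbatim.
pattern = (
    r'Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;'
    r'\s+Location\:\s+([^;]+)\;\s*'
    r'(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*'
    r'(user\:\s+([^;]+)\;){0,1}\s*(.*)'
)

# Hypothetical OSSEC alert. Note the two spaces after the rule ID, which
# the \s+\s+ in the pattern requires.
sample = ("ossec: Alert Level: 6; Rule: 5716  SSHD authentication failed.; "
          "Location: (server1) 10.0.0.2->/var/log/auth.log; "
          "srcip: 10.0.0.1; user: root; Failed password")

m = re.search(pattern, sample)

# Mirror the FORMAT line: map capture groups to the field names it assigns.
fields = {
    "severity_id": m.group(1),
    "signature_id": m.group(2),
    "signature": m.group(3),
    "Location": m.group(4),
    "src_ip": m.group(6),
    "user": m.group(8),
    "Message": m.group(9),
}
print(fields)
```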
Extract the dest field

Some of the fields need additional extraction to fully match the Common Information Model. The Location field actually includes several separate fields within a single field value. Create the following stanza in default/transforms.conf to extract the destination DNS name, destination IP address, and original source address:
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec
[kv_for_ossec]
REGEX = Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;\s+Location\:\s+([^;]+)\;\s*(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*(user\:\s+([^;]+)\;){0,1}\s*(.*)
FORMAT = severity_id::"$1" signature_id::"$2" signature::"$3" Location::"$4" src_ip::"$6" user::"$8" Message::"$9"

[Location_kv_for_ossec]
SOURCE_KEY = Location
REGEX = (\(([^\)]+)\))*\s*(.*?)(->)(.*)
FORMAT = dest_dns::"$2" dest_ip::"$3" orig_source::"$5"
The "Location_kv_for_ossec" stanza creates two fields that represent the destination (either by the DNS name or destination IP address). We need a single field named "dest" that represents the destination. To handle this, add stanzas to default/transforms.conf that populate the destination field if the dest_ip or dest_dns is not empty:
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec

[kv_for_ossec]
REGEX = Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;\s+Location\:\s+([^;]+)\;\s*(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*(user\:\s+([^;]+)\;){0,1}\s*(.*)
FORMAT = severity_id::"$1" signature_id::"$2" signature::"$3" Location::"$4" src_ip::"$6" user::"$8" Message::"$9"

[Location_kv_for_ossec]
SOURCE_KEY = Location
REGEX = (\(([^\)]+)\))*\s*(.*?)(->)(.*)
FORMAT = dest_dns::"$2" dest_ip::"$3" orig_source::"$5"

[dest_ip_as_dest]
SOURCE_KEY = dest_ip
REGEX = (.+)
FORMAT = dest::"$1"

[dest_dns_as_dest]
SOURCE_KEY = dest_dns
REGEX = (.+)
FORMAT = dest::"$1"
Note: The regular expressions above are designed to match only if the string has at least one character. This ensures that the destination is not an empty string. Next, enable the field extractions created in default/transforms.conf by adding them to default/props.conf. We want to set up our field extractions to ensure that we get the DNS name instead of the IP address if both are available. We do this by placing the "dest_dns_as_dest" transform first. This works because Splunk processes field extractions in order, stopping on the first one that matches.
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false
REPORT-0kv_for_ossec = kv_for_ossec, Location_kv_for_ossec
REPORT-dest_for_ossec = dest_dns_as_dest,dest_ip_as_dest

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec
Extract the src field

We populated the source IP address into the field "src_ip", but the CIM requires a separate "src" field as well. We can create this by adding a field alias in default/props.conf that populates the "src" field with the value in "src_ip":
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false
REPORT-0kv_for_ossec = kv_for_ossec, Location_kv_for_ossec
REPORT-dest_for_ossec = dest_dns_as_dest,dest_ip_as_dest
FIELDALIAS-src_for_ossec = src_ip as src

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec
Normalize the severity field

The OSSEC data includes a field that contains an integer value for the severity; however, the Common Information Model requires a string value for the severity. Therefore, we need to convert the input value to a value that matches the Common Information Model. We do this using a lookup table. First, map the "severity_id" values to the corresponding severity strings in a CSV file, lookups/ossec_severities.csv:
severity_id,severity
0,informational
1,informational
2,informational
3,informational
4,error
5,error
6,low
7,low
8,low
9,medium
10,medium
11,medium
12,high
13,high
14,high
15,critical
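The effect of this lookup can be sketched in Python: build a dictionary from the CSV and map each severity_id to its severity string (the CSV contents are inlined here for the sketch):

```python
import csv
import io

# Contents of lookups/ossec_severities.csv, inlined for this sketch.
OSSEC_SEVERITIES_CSV = """\
severity_id,severity
0,informational
1,informational
2,informational
3,informational
4,error
5,error
6,low
7,low
8,low
9,medium
10,medium
11,medium
12,high
13,high
14,high
15,critical
"""

# Build the severity_id -> severity mapping that the lookup provides.
severity_for = {row["severity_id"]: row["severity"]
                for row in csv.DictReader(io.StringIO(OSSEC_SEVERITIES_CSV))}

print(severity_for["6"])  # an OSSEC alert level of 6 normalizes to "low"
```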
[kv_for_ossec]
REGEX = Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;\s+Location\:\s+([^;]+)\;\s*(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*(user\:\s+([^;]+)\;){0,1}\s*(.*)
FORMAT = severity_id::"$1" signature_id::"$2" signature::"$3" Location::"$4" src_ip::"$6" user::"$8" Message::"$9"

[Location_kv_for_ossec]
SOURCE_KEY = Location
REGEX = (\(([^\)]+)\))*\s*(.*?)(->)(.*)
FORMAT = dest_dns::"$2" dest_ip::"$3" orig_source::"$5"

[dest_ip_as_dest]
SOURCE_KEY = dest_ip
REGEX = (.+)
FORMAT = dest::"$1"

[dest_dns_as_dest]
SOURCE_KEY = dest_dns
REGEX = (.+)
FORMAT = dest::"$1"

[ossec_severities_lookup]
filename = ossec_severities.csv
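For the [ossec_severities_lookup] stanza to actually populate the severity field at search time, default/props.conf also needs a LOOKUP line wiring the lookup to the ossec source type; a sketch of what that line would look like:

```
[ossec]
LOOKUP-severity_for_ossec = ossec_severities_lookup severity_id OUTPUT severity
```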
Define the vendor and product fields

The last fields to populate are the vendor and product fields. To populate these, add stanzas to default/transforms.conf to statically define them:
[source::....ossec]
sourcetype = ossec

[ossec]
SHOULD_LINEMERGE = false

[source::udp:514]
TRANSFORMS-force_sourcetype_for_ossec_syslog = force_sourcetype_for_ossec

[kv_for_ossec]
REGEX = Alert Level\:\s+([^;]+)\;\s+Rule\:\s+([^\s]+)\s+\s+([^\.]+)\.{0,1}\;\s+Location\:\s+([^;]+)\;\s*(srcip\:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\;){0,1}\s*(user\:\s+([^;]+)\;){0,1}\s*(.*)
FORMAT = severity_id::"$1" signature_id::"$2" signature::"$3" Location::"$4" src_ip::"$6" user::"$8" Message::"$9"

[Location_kv_for_ossec]
SOURCE_KEY = Location
REGEX = (\(([^\)]+)\))*\s*(.*?)(->)(.*)
FORMAT = dest_dns::"$2" dest_ip::"$3" orig_source::"$5"

[dest_ip_as_dest]
SOURCE_KEY = dest_ip
REGEX = (.+)
FORMAT = dest::"$1"

[dest_dns_as_dest]
SOURCE_KEY = dest_dns
REGEX = (.+)
FORMAT = dest::"$1"

[ossec_severities_lookup]
filename = ossec_severities.csv

[product_static_hids]
REGEX = (.)
FORMAT = product::"HIDS"

[vendor_static_open_source_security]
REGEX = (.)
FORMAT = vendor::"Open Source Security"
Verify field extractions

Now that we have created the field extractions, we need to verify that they are correct. First, restart Splunk so that it recognizes the lookups we created. Next, search for the source type in the Search dashboard:
sourcetype="ossec"
When the results are displayed, select Pick Fields to choose the fields that ought to be populated. Once the fields are selected, hover over a field name to display the values that have been observed.
Identify necessary tags

The Common Information Model requires that intrusion detection data be tagged with "host" and "ids":

Domain              Sub-Domain  Macro  Tags
Network Protection  IDS         ids    host, ids
Create tag properties Now that we know what tags we need, let's create them. First, we need to create the event types that we can assign the tags to. To do so, we create an event type in eventtypes.conf that assigns an event type of "ossec" to all data with the source type of ossec. We also create an additional event type, "ossec_attack", which applies only to those OSSEC events that are related to attacks:
[ossec]
search = sourcetype=ossec
#tags = host ids

[ossec_attack]
search = sourcetype=ossec
#tags = attack
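As in the Blue Coat example, the tags themselves are enabled per event type in default/tags.conf; a sketch based on the commented tag lists above:

```
[eventtype=ossec]
host = enabled
ids = enabled

[eventtype=ossec_attack]
attack = enabled
```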
Now that the tags have been created, verify that the tags are being applied. Search for the source type in the Search dashboard:
sourcetype="ossec"
Review the entries and look for the tag statements (they should be present under the log message).

Check PCI compliance dashboards

Now that the tags and field extractions are complete, the data should be ready to show up in the Splunk App for PCI Compliance. The fields extracted and the tags defined should fit into the Intrusion Center dashboard; therefore, the OSSEC data ought to be visible on this view. The OSSEC data will not be immediately available in the dashboard, since PCI Compliance uses summary indexing; the data won't be available on the dashboard for up to an hour after the technology add-on is complete. After an hour or so, the dashboard should begin populating with OSSEC data.
Package the technology add-on

Package up the OSSEC technology add-on by converting it into a zip archive named Splunk_TA-ossec.zip. To share the technology add-on, go to Splunkbase, click upload an app, and follow the instructions for the upload.
Resources
FAQ
I edited the transforms from Splunk Web and now I have content in the local directory. How do I merge this with the default content?
You can merge content from the local directory by copying the stanzas from the file in the local directory into the corresponding file in the default directory. For example, say the local transforms file (local/transforms.conf) includes:
[bluecoat]
SHOULD_LINEMERGE = false

[product_static_Proxy]
REGEX = (.)
FORMAT = product::"Proxy"
The combined transforms file (in default/transforms.conf) would look like this:
[bluecoat]
SHOULD_LINEMERGE = false
REPORT-0auto_kv_for_bluecoat = auto_kv_for_bluecoat

[product_static_Proxy]
REGEX = (.)
FORMAT = product::"Proxy"
Once you have migrated all the stanzas, make sure to delete the files in the local directory.
My source data is mostly tab-delimited, but the first three fields are space-delimited... these fields contain the date and time, the log host, and the log type. What should I do?
Put these fields into one field called log_header and ignore it. The fields are not necessary for the technology add-on to function.
Known Issues
Splunk fails to extract values spanning multiple lines

Splunk fails to automatically extract values when those values span multiple lines. The fields are extracted with the correct name, but the value is left empty if the original value includes multiple lines. To work around this issue, create a transform that extracts the entire field. Below is a transform, in transforms.conf, that extracts the multi-line field "message" for the source type "acme_firewall":
[message_for_acme_firewall]
REGEX = ,\s+message=\"(.*?)(\",\s+\S+\=)
FORMAT = message::"$1"
Access Protection
The Access Protection domain provides information about authentication attempts and access control related events (login, logout, access allowed, access failure, use of default accounts, and so on).
Account Management

Field Name      Data Type  Explanation
signature       string     Description of the change performed
src_nt_domain   string     The domain that contains the user that generated the account management event
dest_nt_domain  string     The domain that contains the user that is affected by the account management event
Authentication

action (string): Must be either "success" or "failure".
app (string): The application involved in authentication (for example, ssh, splunk, win:local).
dest (string): The target involved in authentication. (one of: dest_host, dest_ip, dest_ipv6, dest_nt_host)
src (string): The source involved in authentication. (one of: src_host, src_ip, src_ipv6, src_nt_host)
src_user (string): Privilege escalation events must include this field to represent the user who initiated the privilege escalation.
user (string): The user involved in authentication. For privilege escalation events this should represent the user targeted by the escalation.
Endpoint Protection

The Endpoint Protection domain includes information about endpoints, such as malware infections, system configuration, system state (CPU usage, open ports, uptime, etc.), system update history (which updates have been applied), and time synchronization information.

src (string): The client. Required for this entire Enterprise Security domain. (one of: src_host, src_ip, src_nt_host)
Change Analysis

Fields in this table record the type of action performed on the resource, the data associated with the change event, and the host affected by the change, along with the following:

object_category (string)
object_id (string)
object_path (string)
severity (string)
status (string)
user (string)
user_type (string)
Update

signature (string): Name of the update that was installed.
Malware

action (string): The outcome of the infection; must be one of "allowed", "blocked", or "deferred".
product (string): The product name of the vendor technology (the "vendor" field) generating malware data (for example, Antivirus, EPO).
signature (string): The name of the malware infection detected on the client (the "src") (for example, Trojan.Vundo, Spyware.Gaobot, W32.Nimbda). Note: This field is a string. Use the "signature_id" field for numbers.
dest (string): The target affected or infected by the malware. (one of: dest_host, dest_ip, dest_ipv6, dest_nt_host)
dest_nt_domain (string): The NT domain of the destination (the "dest_bestmatch").
src_nt_domain (string): The NT domain of the source (the "src").
vendor (string): The name of the vendor technology generating malware data (for example, Symantec, McAfee).
file_path (string): The path of the file in the event (such as the infected or malicious file).
file_hash (string): The cryptographic hash of the file associated with the event (such as the infected or malicious file).
user (string): The user involved in a malware event.
file_name (string): The name of the file in the event (such as the infected or malicious file).
product_version (string): The product version number of the vendor technology installed on the client (for example, 10.4.3, 11.0.2).
signature_version (string): The current signature set (a.k.a. definitions) running on the client (for example, 11hsvx).
Additional system state fields include integer fields for the amount of memory available and the amount of memory used on the system (the "src" field), along with the following:

FreeMegabytes (int): The amount of disk space available per drive or mount (the "mount" field) on the system (the "src" field).
mount (string): The drive or mount reporting available disk space (the "FreeMegabytes" field) on the system (the "src" field).
PercentProcessorTime (int): The percentage of processor utilization.
src_port (int): The TCP/UDP source port on the system.
app (string): The running application or service (e.g., explorer.exe, sshd) on the system (the "src" field).
user (string): The User Account present on the system (the "src" field).
shell (string): The shell provided to the User Account (the "user" field) upon logging into the system (the "src" field).
setlocaldefs (int): The setlocaldefs setting from the SE Linux configuration.
Startmode (string): The start mode of the given service (disabled, enabled, or auto).
sshd_protocol (string): The version of the sshd protocol.
kernel_release (string)

Other fields record the values from the selinux configuration file (disabled or enforcing), the SE Linux type (such as targeted), the number of updates the system (the "src" field) is missing, the number of seconds since the system (the "src") has been "up", a human-readable version of the system uptime, the name of the operating system installed on the host (the "src") (for example, Microsoft Windows Server 2003, GNU/Linux), and the version of the operating system installed on the host (for example, 6.0.1.4, 2.6.27.30-170.2.82.fc10.x86_64).
Network Protection

Network Protection includes information about network traffic provided by devices such as firewalls, routers, and network-based intrusion detection systems.

Change Analysis

dvc (string): The device that is directly affected by the change.
action (string): The type of change observed.
user (string): The user that initiated the given change.
command (string): The command that initiated the given change.
Proxy

action (string): The action taken by the proxy.
status (int): The HTTP response code indicating the status of the proxy request (404, 302, 500, etc.).
src (string): The source of the network traffic (the client requesting the connection).
dest (string): The destination of the network traffic (the remote host).
http_content_type (string): The content-type of the resource requested.
http_referrer (string): The HTTP referrer used in requesting the HTTP resource.
http_user_agent (string): The user agent used when requesting the HTTP resource.
http_method (string): The HTTP method used in requesting the resource (GET, POST, DELETE, and so on).
user (string): The user that requested the HTTP resource.
url (string): The URL of the requested HTTP resource.
vendor (string): The vendor technology generating Network Protection data; required for this entire Enterprise Security domain (for example, IDP, Proventia, ASA).
product (string): The product name of the vendor technology generating Network Protection data; required for this entire Enterprise Security domain (for example, IDP, Proventia, ASA).
Traffic

action (string): The action of the network traffic.
transport (string): The transport protocol of the traffic observed (tcp, udp, icmp).
dvc (string): The name of the packet filtering device. (one of: dvc_host, dvc_ip, dvc_nt_host)
src (string): The source of the network traffic. (one of: src_host, src_ip, src_ipv6, src_nt_host)
dest (string): The destination of the network traffic. (one of: dest_host, dest_ip, dest_ipv6, dest_nt_host)
src_port (int): The source port of the network traffic.
dest_port (int): The destination port of the network traffic.
vendor (string): The vendor technology generating Network Protection data; required for this entire Enterprise Security domain (for example, IDP, Proventia, ASA).
product (string): The product name of the vendor technology generating Network Protection data; required for this entire Enterprise Security domain (for example, IDP, Proventia, ASA).
product (string): The product name of the vendor technology generating Network Protection data; required for this entire Enterprise Security domain (for example, IDP, Proventia, ASA).
severity (string): The severity of the Network Protection event (critical, high, medium, low, informational). Note: This field is a string. Use the "severity_id" field for numbers.
vendor (string): The vendor technology generating Network Protection data; required for this entire Enterprise Security domain (e.g., Juniper, ISS, Cisco).

Intrusion Detection

signature (string): Note: This field is a string. Use the "signature_id" field for numbers.
dvc (string): The device that detected the event.
category (string): The category of the signature triggered.
severity (string): The severity of the Network Protection event (for example, critical, high, medium, low, informational). Note: This field is a string. Use the "severity_id" field for numbers.
src (string): The source involved in the attack detected by the IDS. (one of: src_host, src_ip, src_ipv6, src_nt_host)
dest (string): The destination of the attack detected by the IDS. (one of: dest_host, dest_ip, dest_ipv6, dest_nt_host)
user (string): The user involved with the attack detected by the IDS.
vendor (string): The vendor technology generating Network Protection data (for example, IDP, Proventia, ASA).
Packet Filtering

action (string): The action the filtering device (the "dvc_bestmatch" field) performed on the communication. This must be either "allowed" or "blocked".
dest_port (int): The IP port of the packet's destination (for example, 22).
dvc (string): The name of the packet filtering device. (one of: dvc_host, dvc_ip, dvc_nt_host)
rule (string): The rule which took action on the packet (for example, 143).
src_port (int): The IP port of the packet's source (for example, 34541).
Vulnerability

signature (string): The name of the vulnerability detected on the client (the "src" field).
os (string)
category (string)
severity (string)
dest (string): (one of: dest_host, dest_ip, dest_ipv6, dest_nt_host)
cve: Corresponds to an identifier provided in the Common Vulnerabilities and Exposures index, http://cve.mitre.org. For example: cve: CVE-1999-0002 (Description: Buffer overflow in NFS mountd gives root access to remote attackers, mostly in Linux systems.)
bugtraq: Corresponds to an identifier in the publicly available Bugtraq vulnerability database (searchable at http://www.securityfocus.com/bid/). For example: bugtraq: 52379
cert: Corresponds to an identifier in the vulnerability database provided by the US Computer Emergency Readiness Team (US-CERT), http://www.kb.cert.org/vuls/. (Description: Oracle Java JRE 1.7 Expression.execute() and SunToolkit.getField() fail to restrict access to privileged code.)
msft: For example: msft: 2743314
mskb
xref: A cross-reference identifier associated with the vulnerability. In most cases, the xref field will contain both a short name of the database being cross-referenced and the unique identifier used in the external database. For example, "OSVDB" refers to the Open Source Vulnerability Database (http://osvdb.org).
More resources
Splunk App for PCI Compliance overview: http://splunkbase.splunk.com/apps/PCI
Questions and answers (general Splunk): http://answers.splunk.com
General Splunk support: http://www.splunk.com/support
Splunk App for PCI Compliance application link: http://splunk-base.splunk.com/apps/Splunk%20App%20for%20PCI%20Compliance