What is an Elastic integration?

This integration is powered by Elastic Agent. Elastic Agent is a single, unified way to add monitoring for logs, metrics, and other types of data to a host. It can also protect hosts from security threats, query data from operating systems, forward data from remote services or hardware, and more. Refer to our documentation for a detailed comparison between Beats and Elastic Agent.

Prefer to use Beats for this use case? See Filebeat modules for logs or Metricbeat modules for metrics.

Overview

This integration is for Cisco Umbrella. It includes the following datasets for receiving logs from an AWS S3 bucket using an SQS notification queue and Cisco Managed S3 bucket without SQS:

  • log dataset: supports Cisco Umbrella logs.

Logs

Umbrella

When using Cisco Managed S3 buckets that does not use SQS there is no load balancing possibilities for multiple agents, a single agent should be configured to poll the S3 bucket for new and updated files, and the number of workers can be configured to scale vertically.

The log dataset collects Cisco Umbrella logs.

An example event for log looks as following:

{
    "destination": {
        "geo": {
            "continent_name": "North America",
            "country_name": "United States",
            "location": {
                "lon": -97.822,
                "lat": 37.751
            },
            "country_iso_code": "US"
        },
        "as": {
            "number": 15169,
            "organization": {
                "name": "Google LLC"
            }
        },
        "address": "8.8.8.8",
        "ip": "8.8.8.8"
    },
    "source": {
        "nat": {
            "ip": "1.1.1.1"
        },
        "address": "192.168.1.1",
        "ip": "192.168.1.1"
    },
    "url": {
        "path": "/blog/ext_id=Anyclip",
        "original": "https://elastic.co/blog/ext_id=Anyclip",
        "scheme": "https",
        "domain": "elastic.co",
        "full": "https://elastic.co/blog/ext_id=Anyclip"
    },
    "tags": [
        "preserve_original_event"
    ],
    "observer": {
        "type": "proxy",
        "product": "Umbrella",
        "vendor": "Cisco"
    },
    "@timestamp": "2020-07-23T23:48:56.000Z",
    "ecs": {
        "version": "8.3.0"
    },
    "related": {
        "hash": [
            ""
        ],
        "ip": [
            "192.168.1.1",
            "1.1.1.1",
            "8.8.8.8"
        ]
    },
    "http": {
        "request": {
            "referrer": "https://google.com/elastic",
            "bytes": 850
        },
        "response": {
            "status_code": 200
        }
    },
    "event": {
        "ingested": "2021-09-13T00:16:24.480432923Z",
        "original": "\"2020-07-23 23:48:56\",\"elasticuser\",\"someotheruser\",\"192.168.1.1\",\"1.1.1.1\",\"8.8.8.8\",\"\",\"ALLOWED\",\"https://elastic.co/blog/ext_id=Anyclip\",\"https://google.com/elastic\",\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36\",\"200\",\"850\",\"\",\"\",\"\",\"Business Services\",\"AVDetectionName\",\"Malicious\",\"MalwareName\",\"\",\"\",\"Roaming Computers\",\"\"",
        "category": "network",
        "type": [
            "allowed"
        ]
    },
    "cisco": {
        "umbrella": {
            "amp_score": "",
            "puas": "Malicious",
            "identities": [
                "someotheruser"
            ],
            "content_type": "",
            "identity_types": "Roaming Computers",
            "blocked_categories": "",
            "sha_sha256": "",
            "amp_disposition": "MalwareName",
            "categories": "Business Services",
            "av_detections": "AVDetectionName",
            "amp_malware_name": ""
        }
    },
    "user": {
        "name": "elasticuser"
    },
    "user_agent": {
        "original": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    }
}

Exported fields

FieldDescriptionType
@timestamp
Date/time when the event originated. This is the date/time extracted from the event, typically representing when the event was generated by the source. If the event source has no original timestamp, this value is typically populated by the first time the event was received by the pipeline. Required field for all events.
date
cisco.umbrella.amp_disposition
The status of the files proxied and scanned by Cisco Advanced Malware Protection (AMP) as part of the Umbrella File Inspection feature; can be Clean, Malicious or Unknown.
keyword
cisco.umbrella.amp_malware_name
If Malicious, the name of the malware according to AMP.
keyword
cisco.umbrella.amp_score
The score of the malware from AMP. This field is not currently used and will be blank.
keyword
cisco.umbrella.audit.after
The policy or setting after the change was made.
keyword
cisco.umbrella.audit.before
The policy or setting before the change was made.
keyword
cisco.umbrella.audit.type
Where the change was made, such as settings or a policy.
keyword
cisco.umbrella.av_detections
The detection name according to the antivirus engine used in file inspection.
keyword
cisco.umbrella.blocked_categories
The categories that resulted in the destination being blocked. Available in version 4 and above.
keyword
cisco.umbrella.categories
The security or content categories that the destination matches.
keyword
cisco.umbrella.certificate_errors
keyword
cisco.umbrella.computer_name
The computer name related to the event.
keyword
cisco.umbrella.content_type
The type of web content, typically text/html.
keyword
cisco.umbrella.datacenter
The name of the Umbrella Data Center that processed the user-generated traffic.
keyword
cisco.umbrella.destination_lists_id
keyword
cisco.umbrella.dlp_status
keyword
cisco.umbrella.file_name
keyword
cisco.umbrella.identities
keyword
cisco.umbrella.identity_types
keyword
cisco.umbrella.origin_id
The unique identity of the network tunnel.
keyword
cisco.umbrella.policy_identity_type
The first identity type matched with this request. Available in version 3 and above.
keyword
cisco.umbrella.puas
A list of all potentially unwanted application (PUA) results for the proxied file as returned by the antivirus scanner.
keyword
cisco.umbrella.request_method
keyword
cisco.umbrella.rule_id
keyword
cisco.umbrella.ruleset_id
keyword
cisco.umbrella.sha_sha256
Hex digest of the response content.
keyword
client.domain
The domain name of the client system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment.
keyword
client.registered_domain
The highest registered client domain, stripped of the subdomain. For example, the registered domain for "foo.example.com" is "example.com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk".
keyword
client.subdomain
The subdomain portion of a fully qualified domain name includes all of the names except the host name under the registered_domain. In a partially qualified domain, or if the the qualification level of the full name cannot be determined, subdomain contains all of the names below the registered domain. For example the subdomain portion of "www.east.mydomain.co.uk" is "east". If the domain has multiple levels of subdomain, such as "sub2.sub1.example.com", the subdomain field should contain "sub2.sub1", with no trailing period.
keyword
client.top_level_domain
The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for example.com is "com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last label will not work well for effective TLDs such as "co.uk".
keyword
cloud.account.id
The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.
keyword
cloud.availability_zone
Availability zone in which this host is running.
keyword
cloud.image.id
Image ID for the cloud instance.
keyword
cloud.instance.id
Instance ID of the host machine.
keyword
cloud.instance.name
Instance name of the host machine.
keyword
cloud.machine.type
Machine type of the host machine.
keyword
cloud.project.id
Name of the project in Google Cloud.
keyword
cloud.provider
Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean.
keyword
cloud.region
Region in which this host is running.
keyword
container.id
Unique container id.
keyword
container.image.name
Name of the image the container was built on.
keyword
container.labels
Image labels.
object
container.name
Container name.
keyword
data_stream.dataset
Data stream dataset.
constant_keyword
data_stream.namespace
Data stream namespace.
constant_keyword
data_stream.type
Data stream type.
constant_keyword
destination.address
Some event destination addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is.
keyword
destination.as.number
Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.
long
destination.as.organization.name
Organization name.
keyword
destination.as.organization.name.text
Multi-field of destination.as.organization.name.
match_only_text
destination.bytes
Bytes sent from the destination to the source.
long
destination.domain
The domain name of the destination system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment.
keyword
destination.geo.city_name
City name.
keyword
destination.geo.continent_name
Name of the continent.
keyword
destination.geo.country_iso_code
Country ISO code.
keyword
destination.geo.country_name
Country name.
keyword
destination.geo.location
Longitude and latitude.
geo_point
destination.geo.region_iso_code
Region ISO code.
keyword
destination.geo.region_name
Region name.
keyword
destination.ip
IP address of the destination (IPv4 or IPv6).
ip
destination.mac
MAC address of the destination. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen.
keyword
destination.nat.ip
Translated ip of destination based NAT sessions (e.g. internet to private DMZ) Typically used with load balancers, firewalls, or routers.
ip
destination.nat.port
Port the source session is translated to by NAT Device. Typically used with load balancers, firewalls, or routers.
long
destination.port
Port of the destination.
long
dns.question.name
The name being queried. If the name field contains non-printable characters (below 32 or above 126), those characters should be represented as escaped base 10 integers (\DDD). Back slashes and quotes should be escaped. Tabs, carriage returns, and line feeds should be converted to \t, \r, and \n respectively.
keyword
dns.question.registered_domain
The highest registered domain, stripped of the subdomain. For example, the registered domain for "foo.example.com" is "example.com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk".
keyword
dns.question.subdomain
The subdomain is all of the labels under the registered_domain. If the domain has multiple levels of subdomain, such as "sub2.sub1.example.com", the subdomain field should contain "sub2.sub1", with no trailing period.
keyword
dns.question.top_level_domain
The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for example.com is "com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last label will not work well for effective TLDs such as "co.uk".
keyword
dns.question.type
The type of record being queried.
keyword
dns.response_code
The DNS response code.
keyword
dns.type
The type of DNS event captured, query or answer. If your source of DNS events only gives you DNS queries, you should only create dns events of type dns.type:query. If your source of DNS events gives you answers as well, you should create one event per query (optionally as soon as the query is seen). And a second event containing all query details as well as an array of answers.
keyword
ecs.version
ECS version this event conforms to. ecs.version is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events.
keyword
error.message
Error message.
match_only_text
event.action
The action captured by the event. This describes the information in the event. It is more specific than event.category. Examples are group-add, process-started, file-created. The value is normally defined by the implementer.
keyword
event.code
Identification code for this event, if one exists. Some event sources use event codes to identify messages unambiguously, regardless of message language or wording adjustments over time. An example of this is the Windows Event ID.
keyword
event.dataset
Event dataset
constant_keyword
event.ingested
Timestamp when an event arrived in the central data store. This is different from @timestamp, which is when the event originally occurred. It's also different from event.created, which is meant to capture the first time an agent saw the event. In normal conditions, assuming no tampering, the timestamps should chronologically look like this: @timestamp < event.created < event.ingested.
date
event.module
Event module
constant_keyword
event.original
Raw text message of entire event. Used to demonstrate log integrity or where the full log message (before splitting it up in multiple parts) may be required, e.g. for reindex. This field is not indexed and doc_values are disabled. It cannot be searched, but it can be retrieved from _source. If users wish to override this and index this field, please see Field data types in the Elasticsearch Reference.
keyword
event.outcome
This is one of four ECS Categorization Fields, and indicates the lowest level in the ECS category hierarchy. event.outcome simply denotes whether the event represents a success or a failure from the perspective of the entity that produced the event. Note that when a single transaction is described in multiple events, each event may populate different values of event.outcome, according to their perspective. Also note that in the case of a compound event (a single event that contains multiple logical events), this field should be populated with the value that best captures the overall success or failure from the perspective of the event producer. Further note that not all events will have an associated outcome. For example, this field is generally not populated for metric events, events with event.type:info, or any events for which an outcome does not make logical sense.
keyword
event.timezone
This field should be populated when the event's timestamp does not include timezone information already (e.g. default Syslog timestamps). It's optional otherwise. Acceptable timezone formats are: a canonical ID (e.g. "Europe/Amsterdam"), abbreviated (e.g. "EST") or an HH:mm differential (e.g. "-05:00").
keyword
host.architecture
Operating system architecture.
keyword
host.containerized
If the host is a container.
boolean
host.domain
Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider.
keyword
host.hostname
Hostname of the host. It normally contains what the hostname command returns on the host machine.
keyword
host.id
Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of beat.name.
keyword
host.ip
Host ip addresses.
ip
host.mac
Host mac addresses.
keyword
host.name
Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use.
keyword
host.os.build
OS build information.
keyword
host.os.codename
OS codename, if any.
keyword
host.os.family
OS family (such as redhat, debian, freebsd, windows).
keyword
host.os.kernel
Operating system kernel version as a raw string.
keyword
host.os.name
Operating system name, without the version.
keyword
host.os.name.text
Multi-field of host.os.name.
text
host.os.platform
Operating system platform (such centos, ubuntu, windows).
keyword
host.os.version
Operating system version as a raw string.
keyword
host.type
Type of host. For Cloud providers this can be the machine type like t2.medium. If vm, this could be the container, for example, or other information meaningful in your environment.
keyword
http.request.bytes
Total size in bytes of the request (body and headers).
long
http.request.method
HTTP request method. The value should retain its casing from the original event. For example, GET, get, and GeT are all considered valid values for this field.
keyword
http.request.referrer
Referrer for this HTTP request.
keyword
http.response.bytes
Total size in bytes of the response (body and headers).
long
http.response.status_code
HTTP response status code.
long
input.type
Type of Filebeat input.
keyword
log.file.path
Full path to the log file this event came from, including the file name. It should include the drive letter, when appropriate. If the event wasn't read from a log file, do not populate this field.
keyword
message
For log events the message field contains the log message, optimized for viewing in a log viewer. For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event. If multiple messages exist, they can be combined into one message.
match_only_text
network.community_id
A hash of source and destination IPs and ports, as well as the protocol used in a communication. This is a tool-agnostic standard to identify flows. Learn more at https://github.com/corelight/community-id-spec.
keyword
network.direction
Direction of the network traffic. When mapping events from a host-based monitoring context, populate this field from the host's point of view, using the values "ingress" or "egress". When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of the network perimeter, using the values "inbound", "outbound", "internal" or "external". Note that "internal" is not crossing perimeter boundaries, and is meant to describe communication between two hosts within the perimeter. Note also that "external" is meant to describe traffic between two hosts that are external to the perimeter. This could for example be useful for ISPs or VPN service providers.
keyword
network.transport
Same as network.iana_number, but instead using the Keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.) The field value must be normalized to lowercase for querying.
keyword
observer.product
The product name of the observer.
keyword
observer.type
The type of the observer the data is coming from. There is no predefined list of observer types. Some examples are forwarder, firewall, ids, ips, proxy, poller, sensor, APM server.
keyword
observer.vendor
Vendor name of the observer.
keyword
related.hash
All the hashes seen on your event. Populating this field, then using it to search for hashes can help in situations where you're unsure what the hash algorithm is (and therefore which key name to search).
keyword
related.hosts
All hostnames or other host identifiers seen on your event. Example identifiers include FQDNs, domain names, workstation names, or aliases.
keyword
related.ip
All of the IPs seen on your event.
ip
related.user
All the user names or other user identifiers seen on the event.
keyword
rule.id
A rule ID that is unique within the scope of an agent, observer, or other entity using the rule for detection of this event.
keyword
source.address
Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is.
keyword
source.as.number
Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.
long
source.as.organization.name
Organization name.
keyword
source.as.organization.name.text
Multi-field of source.as.organization.name.
match_only_text
source.bytes
Bytes sent from the source to the destination.
long
source.domain
The domain name of the source system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment.
keyword
source.geo.city_name
City name.
keyword
source.geo.continent_name
Name of the continent.
keyword
source.geo.country_iso_code
Country ISO code.
keyword
source.geo.country_name
Country name.
keyword
source.geo.location
Longitude and latitude.
geo_point
source.geo.region_iso_code
Region ISO code.
keyword
source.geo.region_name
Region name.
keyword
source.ip
IP address of the source (IPv4 or IPv6).
ip
source.mac
MAC address of the source. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen.
keyword
source.nat.ip
Translated ip of source based NAT sessions (e.g. internal client to internet) Typically connections traversing load balancers, firewalls, or routers.
ip
source.nat.port
Translated port of source based NAT sessions. (e.g. internal client to internet) Typically used with load balancers, firewalls, or routers.
long
source.port
Port of the source.
long
source.registered_domain
The highest registered source domain, stripped of the subdomain. For example, the registered domain for "foo.example.com" is "example.com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk".
keyword
source.subdomain
The subdomain portion of a fully qualified domain name includes all of the names except the host name under the registered_domain. In a partially qualified domain, or if the the qualification level of the full name cannot be determined, subdomain contains all of the names below the registered domain. For example the subdomain portion of "www.east.mydomain.co.uk" is "east". If the domain has multiple levels of subdomain, such as "sub2.sub1.example.com", the subdomain field should contain "sub2.sub1", with no trailing period.
keyword
source.top_level_domain
The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for example.com is "com". This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Trying to approximate this by simply taking the last label will not work well for effective TLDs such as "co.uk".
keyword
tags
List of keywords used to tag each event.
keyword
url.domain
Domain of the url, such as "www.elastic.co". In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the domain field. If the URL contains a literal IPv6 address enclosed by [ and ] (IETF RFC 2732), the [ and ] characters should also be captured in the domain field.
keyword
url.extension
The field contains the file extension from the original request url, excluding the leading dot. The file extension is only set if it exists, as not every url has a file extension. The leading period must not be included. For example, the value must be "png", not ".png". Note that when the file name has multiple extensions (example.tar.gz), only the last one should be captured ("gz", not "tar.gz").
keyword
url.full
If full URLs are important to your use case, they should be stored in url.full, whether this field is reconstructed or present in the event source.
wildcard
url.full.text
Multi-field of url.full.
match_only_text
url.original
Unmodified original url as seen in the event source. Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not.
wildcard
url.original.text
Multi-field of url.original.
match_only_text
url.path
Path of the request, such as "/search".
wildcard
url.query
The query field describes the query string of the request, such as "q=elasticsearch". The ? is excluded from the query string. If a URL contains no ?, there is no query field. If there is a ? but no query, the query field exists with an empty string. The exists query can be used to differentiate between the two cases.
keyword
url.scheme
Scheme of the request, such as "https". Note: The : is not part of the scheme.
keyword
user.domain
Name of the directory the user is a member of. For example, an LDAP or Active Directory domain name.
keyword
user.email
User email address.
keyword
user.full_name
User's full name, if available.
keyword
user.full_name.text
Multi-field of user.full_name.
match_only_text
user.id
Unique identifier of the user.
keyword
user.name
Short name or login of the user.
keyword
user.name.text
Multi-field of user.name.
match_only_text
user_agent.original
Unparsed user_agent string.
keyword
user_agent.original.text
Multi-field of user_agent.original.
match_only_text

Changelog

VersionDetails
1.3.0
Enhancement View pull request
Update package to ECS 8.4.0
1.2.2
Bug fix View pull request
Fix proxy URL documentation rendering.
1.2.1
Enhancement View pull request
Add missing proxy config to S3 input
1.2.0
Enhancement View pull request
Enrich DNS fields
1.1.0
Enhancement View pull request
Update package to ECS 8.3.0.
1.0.1
Enhancement View pull request
Update to readme. added link to Cisco documentation
1.0.0
Enhancement View pull request
Make GA
0.7.0
Enhancement View pull request
Add Audit Logs
0.6.1
Bug fix View pull request
Fix use of destination.ip instead of source.nat.ip in DNS logs
0.6.0
Enhancement View pull request
Update to ECS 8.2
0.5.1
Enhancement View pull request
Add documentation for multi-fields
0.5.0
Enhancement View pull request
Update to ECS 8.0
0.4.0
Bug fix View pull request
Update config to support Cisco Managed S3
0.3.2
Bug fix View pull request
Regenerate test files using the new GeoIP database
0.3.1
Bug fix View pull request
Change test public IPs to the supported subset
0.3.0
Enhancement View pull request
Add 8.0.0 version constraint
0.2.2
Enhancement View pull request
Update Title and Description.
0.2.1
Bug fix View pull request
Fix logic that checks for the 'forwarded' tag
0.2.0
Enhancement View pull request
Update to ECS 1.12.0
0.1.0
Enhancement View pull request
Initial migration from Filebeat Module