
LogScale Query Primer


CrowdStrike Query Language Primer

The CrowdStrike Query Language, aka CQL, is both powerful and beautiful. Based largely on open standards and the language of mathematics, it balances simplicity and functionality to help users find what they need, fast.

In this tutorial, we’ll use Falcon LTR data to up-level our CQL skills. To be clear: the content and concepts we will cover can be adapted and reused with any dataset that LogScale or Next-Gen SIEM happens to be ingesting (first-party, third-party, or otherwise).

We will start with the very basics and build on the queries as we go.

Onward.

Watch out for the hashtag on #event_simpleName and #cid

This is a very minor thing, but definitely something to be cognizant of. LogScale has the ability to apply "tags" to fields, which allows it to quickly and efficiently organize, include, or exclude large collections of events as you search. For Falcon LTR data, the parser applies tags to the raw telemetry transparently for you. The reason we're mentioning it: two very important fields, event_simpleName and cid, are tagged in LogScale. Because of this, when you specify an event_simpleName or cid value in your LogScale syntax, you need to put a # (hash or pound) in front of that field name. That's it.

#event_simpleName=ProcessRollup2 #cid=123456789012345678901234

If you forget, or want to know what other fields are tagged, you can just look in the LogScale sidebar:

[Image: tagged fields shown in the LogScale sidebar]

The importance of using tags in queries

Tags are extremely important when writing queries, especially at scale. The tag tells LogScale where to search for the data; think of it as a pointer saying "check these particular buckets of data." A blog describing this can be found here. The more specific the tag, the more optimized the search will be. The efficiency of a query can be measured by looking at the Work: value at the bottom of the query editor. It's scored like golf: lower numbers are better.

Let's look at an example. In the following queries, we're looking for events that map an aid to a ComputerName. These events have multiple tags associated with them, primarily #kind and #event_simpleName. The #kind tag has two values: Primary and Secondary. Basically every Falcon event will be a Primary or Secondary data type. With #kind, we're dealing with extremely large "buckets" of data. On the other hand, #event_simpleName is extremely specific to certain data types.

Using the larger "bucket" of #kind events:

// This is our tag.
#kind=Secondary
// This is a filter within that tag looking for a specific data type.
| SecondaryEventType=aidmaster
// Group the results by aid and show the last ComputerName value for each.
| groupBy(aid, function=selectLast(ComputerName), limit=max)

Net result on that query? 124K work units. Remember that we're digging through extremely large buckets of data with the #kind tag.

Let's try another version of that query. Instead of using the broad #kind=Secondary tag, we're going to narrow this down to two extremely specific tags: #event_simpleName=AgentOnline OR #event_simpleName=HostnameChanged.

// Look at these extremely specific event types that have similar data. 
#event_simpleName=AgentOnline OR #event_simpleName=HostnameChanged
// Group the results by aid and show the last ComputerName value for each.
| groupBy(aid, function=selectLast(ComputerName), limit=max)

The result? Only 4K work units, versus the previous 124K work units. That's a 31x reduction in the amount of resources used by the query. This can translate to a ton of time savings when you're dealing with extremely large data sets.

Moral of the story:

  1. Always use tags.
  2. The more specific the tag, the more optimized the query will likely be.
  3. Pay attention to the work units at the bottom. Lower is better.

Capitalization matters

LogScale is case sensitive when specifying fields and values. In a later section, we’ll cover how to override this with regex; for now, just know that you will want to pay attention to the capitalization of commonly used fields like event_platform.

event_platform=Lin

You can make a search case-insensitive by wrapping the value in forward slashes (turning it into a regex) and adding i after the trailing slash. For example:

// This is case-sensitive.
event_platform=Lin

// This is case-insensitive. 
event_platform=/Lin/i

It’s a small thing, but as you’re starting with LogScale it could trip you up. Just remember to check capitalization in your searches.

[Image: case-sensitive search example]

Say goodbye to _decimal and _readable

When viewing Falcon data in Event Search, many fields end with the strings _decimal or _readable. Examples would be ProcessStartTime_decimal, TargetProcessId_decimal, UserSid_readable, etc. Did you know that the sensor doesn’t actually send this data? It was a design decision made over 10 years ago. These strings are appended to the target field after the event reaches the CrowdStrike Security Cloud. In an attempt to fend off carpal tunnel, and keep things tidy, we do away with these now-extraneous bits in LTR. If you have searches that include _decimal or _readable field names in Event Search, you can just omit those dangling modifiers when using LogScale.

#event_simpleName=ProcessRollup2 UserSid="S-1-5-18" TargetProcessId=8619187594
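For comparison, the same search in Event Search would look something like this, with the dangling suffixes intact:

event_simpleName=ProcessRollup2 UserSid_readable="S-1-5-18" TargetProcessId_decimal=8619187594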

Tab to complete syntax

One of my favorite features in LogScale is the ability to use tab-to-complete when invoking query functions. There are hundreds of query functions available to you. They are documented here.

The tab-to-complete feature works automatically as you start typing in LogScale. When you see what you want, you can use the arrow keys and tab to leverage autocomplete.

[Image: tab-to-complete syntax suggestions]

Adding comments in query syntax

Adding comments to query syntax in-line is extremely useful and simple. Comments can be created by typing two forward slashes ( // ) in the LogScale search query bar. The comment will highlight in green. You can add as many comments as you’d like as you search. Here is a quick example:

// Get all ProcessRollup2 events
#event_simpleName=ProcessRollup2
// Search for system User SID
| UserSid="S-1-5-18"
// Count total executions
| count(aid, as=totalExecutions)

[Image: commented query in the search bar]

Adding comments to your syntax is a great way to facilitate knowledge transfer and make query triage much easier.

Handling timestamps

One very important thing to note is that LogScale functions expect epoch timestamps that include milliseconds and do not mark them with a decimal point (as opposed to, say, an ISO-8601 string). As an example, the following is a valid epoch timestamp in LogScale:

1674233057235

An easy rule is: epoch time stamps should have 13 digits and no decimal places. If they have only 10 digits, or contain 10 digits before the decimal point, you can simply multiply the target timestamp field by 1000.

// Convert a seconds-based timestamp (or one with a decimal point) to milliseconds
| myTimeStamp := myTimeStamp * 1000

Once in the appropriate epoch format, timestamps can be converted using formatTime following the instructions here. A quick example would be:

#event_simpleName=ProcessRollup2
// Convert ProcessStartTime to proper epoch format
| ProcessStartTime := ProcessStartTime * 1000
// Convert epoch Time to Human Time
| HumanTime := formatTime("%Y-%m-%d %H:%M:%S.%L", field=ProcessStartTime, locale=en_US, timezone=Z)
| select([ProcessStartTime, HumanTime, aid, ImageFileName])

[Image: converting epoch time to human-readable time]

Important: as you can see highlighted above, LogScale will automatically convert displayed timestamps to match your browser's default timezone. This default can be changed in your LogScale profile, or you can change it ad hoc using the dropdown selector. All timestamps are stored in UTC.

Using the assignment operator

A very handy capability in LogScale is the use of the assignment operator. That’s this thing…

:=

In Event Search, we would typically use eval in places where the assignment operator is used in LogScale. Here is a quick example:

| timeDelta := now() - (ProcessStartTime*1000)

What this says is: assign to the field timeDelta the result of the current time minus the value of ProcessStartTime multiplied by 1,000.
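If you want that delta in something more readable, assignments can be chained. A quick sketch converting the delta from milliseconds to days (the timeDeltaDays field name is our own):

| timeDelta := now() - (ProcessStartTime*1000)
| timeDeltaDays := timeDelta / 1000 / 60 / 60 / 24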

Simple aggregations using field list shortcuts

You can perform simple aggregation functions with the help of shortcuts located in the fields list on the left side of the screen. As an example, gather all user logon events for macOS:

#event_simpleName=UserLogon event_platform=Mac

On the left side of the screen will be a list of the first 200 fields seen by LogScale. Let’s use the shortcuts (demarcated by three dots) to perform some aggregations. If we wanted to see the top UserName values, we could use the following:

[Image: aggregation shortcuts in the fields list]

Any of the other available aggregates or shortcuts can be used on the results. Note that if you click an aggregation it auto-searches; however, you can SHIFT+click to append the aggregation to the bottom of any query you already have in the search bar.
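For reference, the shortcut is roughly equivalent to running an aggregation like this yourself (a sketch using the top() function; the sidebar shortcut may emit slightly different syntax):

#event_simpleName=UserLogon event_platform=Mac
| top(UserName)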

Regular Expressions (regex)

If you love regular expressions, you’re going to really love LogScale. Regular expressions can be invoked almost anywhere by encasing your regex in forward slashes. A quick example might be:

#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\(System32|SysWow64)\\/i

The above looks for process execution events with an ImageFileName field that includes one of the following two values (with case insensitivity enabled): \System32\ or \SysWow64\.

A few important things to note:

A starting and trailing wildcard is assumed. You don’t need to add .* to the beginning or the end of your regex. If you want a literal string beginning or string ending, you can anchor your regex with a ^ or $ respectively (e.g. /^powershell\.exe$/i).

You can make your regex case insensitive by adding an i at the end of the statement outside of the trailing forward slash.

You’re free to include field extractions in-line as well. Example:

#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\(?<systemFolder>(System32|SysWow64))\\/i
| groupBy([systemFolder, ImageFileName])

Using case statements

On occasion, you may want to leverage case statements to complete string substitutions within given fields. While there are several ways to accomplish this in LogScale, one of the easiest and most common is below:

| case {
    UserIsAdmin=1 | UserIsAdmin := "True" ;
    UserIsAdmin := "False" ;
    * ;
  }

This is what we call a destructive case statement. The statement looks at the field UserIsAdmin and, if the value of that field is “1,” overwrites it with the string “True.” Any other value, including “0,” is overwritten with “False.”

Non-destructive case statements can also be used:

| case {
    UserIsAdmin=1 | UserIsAdmin_Readable := "True" ;
    UserIsAdmin=0 | UserIsAdmin_Readable := "False" ;
    * ;
  }

Now, the statement looks at the field UserIsAdmin and, if the value of that field is “1,” it sets the value of a new string UserIsAdmin_Readable to “True.” If the value of that field is “0,” it sets the value of the new string UserIsAdmin_Readable to “False.”

[Image: non-destructive case statement output]

A large list of case statement transforms, for those interested, can be found on CrowdStrike’s GitHub page here.

Leveraging saved queries as functions

In LogScale, users have the ability to save queries for fast and easy future reference. One extremely powerful capability LogScale also has is the ability to use saved queries as functions in new queries. Let’s use the example case statement from above.

We will run that case statement by itself and save it as a “Saved Query” with the name “ConvertUserIsAdmin.”

[Image: saving the case statement as a Saved Query]

We can then invoke it in line:

#event_simpleName=UserLogon
| $ConvertUserIsAdmin()
| select([aid, UserName, UserSid, UserIsAdmin, UserIsAdmin_Readable])

[Image: invoking the saved query in line]

To be clear, Saved Queries can be complete queries with formatted output that you want to reference or parts of queries that you wish to invoke as functions. They are extremely flexible and powerful.

Formatting query output with select

In LogScale, using the select function is akin to using table in Event Search. After you have a fully formed query, and want to organize output into a tabular format, an example is below:

// Get all user logon events for User SID S-1-5-21-*
#event_simpleName=UserLogon event_platform=Win UserSid="S-1-5-21-*"
// Invoke saved query to enrich UserIsAdmin field
| $ConvertUserIsAdmin()
// Use select to output in tabular format
| select([@timestamp, aid, ClientComputerName, UserName, LogonType, UserIsAdmin_Readable])

[Image: tabular output using select]

The function table still exists in LogScale; however, select is more efficient.
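For reference, the table version of the query above would look something like this (a sketch; the output is equivalent, just less efficient):

#event_simpleName=UserLogon event_platform=Win UserSid="S-1-5-21-*"
| $ConvertUserIsAdmin()
| table([@timestamp, aid, ClientComputerName, UserName, LogonType, UserIsAdmin_Readable])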

Format query output with groupBy

One of the more powerful aggregate functions in LogScale is the use of groupBy. The function groupBy is akin to stats in Event Search. One thing to keep in mind when using groupBy is the use of parentheses and square brackets. To invoke an aggregate function, you open with parentheses. To perform that aggregation on multiple fields, you encase your fields or conditions in square brackets.

#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\powershell\.exe/i
| groupBy(SHA256HashData, function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))

[Image: groupBy aggregation output]

If we were to isolate the groupBy statement above to make the clustering a little easier to understand, it would look like this:

| groupBy(SHA256HashData, function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))

Note the use of the square brackets after invoking function. This is because we want to use multiple aggregations in this groupBy.

If you wanted to groupBy multiple fields, you would also use square brackets. As an example:

| groupBy([SHA256HashData, FileName], function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))

Note the first two fields specified immediately after groupBy.

The same principle would be applied if we wanted to collect multiple fields.

| groupBy([SHA256HashData, FileName], function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect([CommandLine, UserSid])]))

Note how:

collect(CommandLine)

Becomes:

collect([CommandLine, UserSid])

This takes a little practice, but once mastered the syntax is logical and very easy to interpret. To assist, LogScale will insert a closing parenthesis or closing square bracket when you open one.

Creating dynamic text boxes in queries

Another unique feature of LogScale is the ability to include editable text boxes in query syntax. When combined with Saved Queries, this becomes a quick and easy way to reuse queries when the target of a search — like usernames, hostnames, or Agent ID values — change, but the query needs to stay the same. Here is an example:

// Get all DNS Request events
#event_simpleName=DnsRequest
// Use regex to determine top level domain
| DomainName=/\.?(?<topLevelDomain>\w+\.\w+$)/i
// Create search box for top level domain
| topLevelDomain=?topLevelDomain
// Count number of domain variations by top level domain
| groupBy(topLevelDomain, function=(count(DomainName, distinct=true, as=domainVariations)))

As you can see, there is now an editable text box that will modify the search. It defaults to a wildcard, but analysts can enter criteria here to dynamically modify the search.

[Image: dynamic search box above the query results]

Multiple dynamic search boxes can be added to queries as desired. The format is:

FieldToSearch=?nameOfTextBox

Note that nameOfTextBox can be changed to any string, but cannot include spaces in this view (they can be edited in Dashboards).
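As a quick sketch of what that looks like with two boxes (the field choices here are ours):

// Get all user logon events
#event_simpleName=UserLogon
// Create search boxes for username and logon type
| UserName=?userName
| LogonType=?logonType
| select([@timestamp, aid, UserName, LogonType])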

Using widget visualizations

Visualizing aggregated data with widgets can add additional context and assist in the creation of custom dashboards. When running a simple query, like this:

#event_simpleName=OsVersionInfo
| groupBy("ProductName")

Selecting the desired widget from the drop down is all that’s required.

[Image: bar chart widget]

LogScale will only allow you to select compatible widgets.

The desired visualization widget can also be specified in the query itself. As an example:

EventType = "Event_ExternalApiEvent" ExternalApiType = "Event_DetectionSummaryEvent"
| sankey(source="Tactic",target="Technique", weight=count(AgentIdString))

[Image: Sankey widget showing Tactic to Technique flow]

The “Save” button can be leveraged to add any query or widget to a custom dashboard.

Customizing visualizations using the format pane

After creating a visualization, you can customize its appearance using the format pane on the right hand side of the screen. It’s identified by a paintbrush icon.

Let’s create a quick pie chart:

EventType="Event_ExternalApiEvent" ExternalApiType="Event_DetectionSummaryEvent"
| groupBy(Severity)

[Image: pie chart widget grouped by Severity]

By clicking the paintbrush in the middle left, we can change view, color, and series options for our chart…

[Image: format pane options for the pie chart]

When you select a visualization, the format pane will automatically adjust to include all available options. Please pick better colors than I did.

Using match statements

The match function can often be used interchangeably with the case function. A good rule of thumb is: if you know the target field you want to transform exists, there are some performance advantages to using match. An example query using match might look like this:

#event_simpleName=UserLogon event_platform=Lin
| UserIsAdmin match {
    1 => UserIsAdmin := "True" ;
    0 => UserIsAdmin := "False" ;
  }
| select([@timestamp, UserName, UID, LogonType, UserIsAdmin])

Since the field UserIsAdmin will always be present in the UserLogon event, using match can help improve the performance of large queries.

The format is:

| targetField match {
    value1 => targetField := "substitution1" ;
    value2 => targetField := "substitution2" ;
  }
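If values other than the ones you expect might appear in the field, a wildcard branch can act as a default (a sketch; the "Unknown" label is our own):

| UserIsAdmin match {
    1 => UserIsAdmin := "True" ;
    0 => UserIsAdmin := "False" ;
    * => UserIsAdmin := "Unknown" ;
  }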

Using regular expression field extractions and matching

Regular expressions are an EXTREMELY powerful search tool and a core capability of LogScale. As mentioned in a previous section, regex can be invoked almost anywhere in LogScale using the query language. Below is a quick example of how to use a regular expression field extraction, combined with a case statement, to evaluate an application version. We’re looking for Chrome versions below 109.5414.

// Get InstalledApplication events for Google Chrome
#event_simpleName=InstalledApplication AppName="Google Chrome"
// Get latest AppVersion for each system
| groupBy(aid, function=([selectLast([AppVendor, AppName, AppVersion, InstallDate])]))
// Use regex to break AppVersion field into components
| AppVersion = /(?<majorVersion>\d+)\.(?<minorVersion>\d+)\.(?<buildNumber>\d+)\.(?<subBuildNumber>\d+)$/i
// Evaluate builds that need to be patched
| case {
    majorVersion>=110 | needsPatch := "No" ;
    majorVersion>=109 AND buildNumber >= 5414 | needsPatch := "No" ;
    majorVersion<=109 AND buildNumber < 5414 | needsPatch := "Yes" ;
    majorVersion<=108 | needsPatch := "Yes" ;
    * 
  }
// Check for needed update and organize output
| needsPatch = "Yes"
| select([aid, needsPatch, AppVendor, AppName, AppVersion, InstallDate])
// Convert timestamp
| InstallDate := InstallDate * 1000
| InstallDate := formatTime("%Y-%m-%d", field=InstallDate, locale=en_US, timezone=Z)

[Image: version evaluations with case statements]

By default, regular expression field extractions are strict, meaning that if the data being searched does not match the expression, the event is omitted from the results. A quick example would be:

#event_simpleName=ProcessRollup2 ImageFileName=/\\(?<fileName>\w{3}\.\w{3}$)/i

What this looks for is a file with a name that is three characters long and has an extension that is three characters long. If that condition is not matched, data is not returned:

[Image: strict regex extraction results]

We can also use non-strict field extractions like so:

#event_simpleName=ProcessRollup2 ImageFileName=/\\(?<fileName>\w+\.\w+$)/i
| regex("(?<fourLetterFileName>^\w{4})\.exe", field=fileName, strict=false)
| groupBy([fileName, fourLetterFileName])

The above looks for file names that are exactly four characters long, followed by the .exe extension. If that does not match, the fourLetterFileName field is left null and the event is kept.

[Image: non-strict regex extraction results]

Query Building 101

Now that we have documented some useful capabilities, let’s go over the basics of building a query.

First rule: if you can start your query using any field that is tagged (demarcated with a pound sign), do it! This allows LogScale to efficiently and ruthlessly discard large swaths of events that you are not interested in. The field used most often is #event_simpleName.

In the example below, we’ll look for any PowerShell execution on a Windows system that includes flags for an encoded command line and is being run by the system user.

Okay, so the first step: we need all Windows process execution events. The easiest and quickest way to get all those events and narrow the dataset is as follows:

#event_simpleName=ProcessRollup2 event_platform=Win

Next, we’ll look for all PowerShell executions:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i

In this instance, we're using a regex on the field ImageFileName to look for the strings powershell.exe or powershell_ise.exe. The letter i outside the trailing forward slash indicates that the match should ignore case.

Now, we want to find command line flags that are indicative of an encoded command being run. Since there are a few options, we’ll use regex to account for the different permutations of the target flag.

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-e(nc|ncodedcommand|ncoded)?\s+/i

We need to capture the following flags (no pun intended):

-e
-enc
-encodedcommand
-encoded

Using regex, we can make a single statement that accounts for all of these.

If we wanted to get really fancy, we could pair this regex search with a string extraction to put the encoded command flag that was used in its own field. As an example:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i

This performs the same search as before; however, it now stores the flag value in a field named encodedFlagUsed.

Per our search requirements, next is making sure this is being run by the system user:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i
| UserSid="S-1-5-18"

Finally, we will organize the output using groupBy to look for the least common command line variations and put them in ascending order of that count:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i
| UserSid="S-1-5-18"
| groupBy([encodedFlagUsed, CommandLine], function=(count(aid, as=executionCount)))
| sort(executionCount, order=asc)

Note, if you wanted to expand this to all users — not just the system user — you could delete or comment out the fourth line in the query like so:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i
// | UserSid="S-1-5-18"
| groupBy([encodedFlagUsed, CommandLine], function=(count(aid, as=executionCount)))
| sort(executionCount, order=asc)

You could also add a threshold, if desired, with the test command:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i
//| UserSid="S-1-5-18"
| groupBy([encodedFlagUsed, CommandLine], function=(count(aid, as=executionCount)))
| test(executionCount < 10)
| sort(executionCount, order=asc)

We could trim the CommandLine string using format to only include the first 100 characters to make things more readable. We would add this before our final aggregation:

| format("%,.100s", field=CommandLine, as=CommandLine)
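Assembled, the complete query reads:

#event_simpleName=ProcessRollup2 event_platform=Win
| ImageFileName=/\\powershell(_ise)?\.exe/i
| CommandLine=/\-(?<encodedFlagUsed>e(nc|ncodedcommand|ncoded)?)\s+/i
| UserSid="S-1-5-18"
// Trim CommandLine to the first 100 characters
| format("%,.100s", field=CommandLine, as=CommandLine)
| groupBy([encodedFlagUsed, CommandLine], function=(count(aid, as=executionCount)))
| sort(executionCount, order=asc)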

And now we have a complete query!

If we wanted to do some visualization, we could change our parameters a bit to look for outliers:

[Image: final output with trimmed command lines]

Based on this data, the use of the flags enc and encodedCommand (with that exact spelling) is not common in my environment. A hunting query, scheduled alert, or Custom IOA could be beneficial.

Adding hyperlinks to search output

When creating dashboards or saved searches, the ability to create hyperlinks using fields from the search results can come in handy. For this, we’ll use the function format. As an example, let’s say we want to: search for executable files written to a system’s "Downloads" folder, create a list, and include a hyperlink to VirusTotal and Hybrid Analysis.

First, let’s get the data we need and do some simple regex extractions:

#event_simpleName=PeFileWritten
| TargetFileName=/\\Downloads\\/
| TargetFileName=/(?<FilePath>.+\\)(?<FileName>.+$)/i

Now, we’ll use our format command to create hyperlinks and organize with groupBy:

#event_simpleName=PeFileWritten
| TargetFileName=/\\Downloads\\/
| TargetFileName=/(?<FilePath>.+\\)(?<FileName>.+$)/i
// Virus Total
| format("[Virus Total](https://www.virustotal.com/gui/file/%s)", field=[SHA256HashData], as="VT")
// Hybrid Analysis
| format("[Hybrid Analysis](https://www.hybrid-analysis.com/search?query=%s)", field=[SHA256HashData], as="HA")
| groupBy([aid, FileName], function=collect([FileName, FilePath, VT, HA]))

[Image: query output with Virus Total and Hybrid Analysis links]

If you click the "Virus Total" or "Hybrid Analysis" link, it will direct you to the appropriate website with the correct data included for the given SHA256.

We can break this down a bit more. What we’re doing is, colloquially, called dorking. Meaning: the URL format for VirusTotal and Hybrid Analysis is known. If we create a URL, and include some data from our search results, we can hard link to external information. For Hybrid Analysis, the format of the URL is:

www.hybrid-analysis.com/search?query=SHA256VALUE

The query value at the end of the URL is dynamic, and we have the SHA256 values of the files we're interested in within our query results. So, using format, we can dork this hyperlink:

| format("[Hybrid Analysis](https://www.hybrid-analysis.com/search?query=%s)", field=[SHA256HashData], as="HA")

The text between the square brackets is what the hyperlink name will be. The text in the "as" statement will be the column title in the aggregate output. You could use this link any time you have a SHA256 value in LogScale.

You could also, as an example, use an Agent ID value to make a one-click RTR link:

// RTR
| format("[RTR Link](https://falcon.crowdstrike.com/
activity/real-time-response/console?start=hosts&aid=%s=f)", field=[aid], as="RTR")

Clicking this link will initiate an RTR session with the aid associated with the event.

[Image: RTR link in query output]

If you have other third-party or internal tooling or resources, make sure to check if the associated URLs are standardized and dorkable.

Helpful Query Examples

Trim Field String Length

Trim a field to be shorter. Only show the first 50 characters of the field CommandLine:

#event_simpleName=ProcessRollup2
| format("%,.50s", field=CommandLine, as=shortCmdLine)
| select([shortCmdLine, CommandLine])

Round a number to a whole value

Calculate the average password age, in days, of Admin and Standard user accounts and then round those values to a whole number:

#event_simpleName=UserLogon
| groupBy(UserIsAdmin, function=(avg(PasswordLastSet, as=avgPswdAge)))
| AvgPswdAgeDays := now() - (avgPswdAge*1000)
| AvgPswdAgeDays := AvgPswdAgeDays/1000/60/60/24
| format("%,.0f",field=["AvgPswdAgeDays"], as="PasswordLastSetDaysRounded")

Set a default field value

Set a default field value, similar to fillnull in Splunk. If the field GrandParentBaseFileName is blank, set it to the value "Unknown."

#event_simpleName=ProcessRollup2
| default(field=GrandParentBaseFileName, value="Unknown")
| select([GrandParentBaseFileName, ParentBaseFileName, ImageFileName])

Use test to do field evaluations

Similar to where in Splunk, check one field value against another. In this example, see if a username has completed more than 25 logons in a given search window:

#event_simpleName=UserLogon
| groupBy(UserName, function=(count(aid, as=totalLogons)))
| test(totalLogons > 25)

Place longitude and latitude on a world map

Use the built-in GeoIP database and the worldMap function to map all SSH connections.

#event_simpleName=UserLogon event_platform=Lin LogonType="10"
| !cidr(RemoteAddressIP4, subnet=["224.0.0.0/4", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/32", "169.254.0.0/16", "0.0.0.0/32"])
| ipLocation(RemoteAddressIP4)
| worldMap(lat=RemoteAddressIP4.lat, lon=RemoteAddressIP4.lon, magnitude=count(RemoteAddressIP4))

Omit RFC-1918 and other reserved addresses from search results

Omit internal IP addresses when keying in on remote IP addresses.

#event_simpleName=UserLogon 
| !cidr(RemoteAddressIP4, subnet=["224.0.0.0/4", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/32", "169.254.0.0/16", "0.0.0.0/32"])

Create concatenated, formatted fields

Create a process lineage one-liner that includes grandparent, parent, and target processes using format:

#event_simpleName=ProcessRollup2
| default(field=GrandParentBaseFileName, value="Unknown")
| format(format="%s > %s > %s", field=[GrandParentBaseFileName,  ParentBaseFileName, FileName], as="processLineage")
| select([aid, TargetProcessId, processLineage])

Query field for multiple values with in

Query a single field for multiple values using the in function.

#event_simpleName=UserLogon
| in(LogonType, values=["2","10"])
| select([@timestamp, UserSid, UserName, LogonType, ClientComputerName])

Concatenate two fields into a new, single field

Take the fields UID and UserSid and make them a new field named userIdentifier.

#event_simpleName=ProcessRollup2
| concat([UID, UserSid], as="userIdentifier")
| select([@timestamp, aid, userIdentifier, ImageFileName])

Use match to query a lookup table

Note that the specified lookup table must exist under "Files." In this instance, we’ll check for common ports and create a new field with their associated protocol.

#event_simpleName=NetworkListenIP4
| match(file="service-names-port-numbers.csv", field=LocalPort, include=ServiceName, ignoreCase=true, strict=false)
| select([@timestamp, aid, LocalPort, ServiceName, RemoteAddressIP4])

Convert decimal values to hexadecimal values

#event_simpleName=UserLogonFailed2 event_platform=Win
| SubStatus_hex := format(field=SubStatus, "0x%X")
| select([@timestamp, aid, UserName, LogonType, SubStatus, SubStatus_hex])
Clone this wiki locally