Splunk : Announcing the Preview of Splunk APM’s AlwaysOn Profiling

October 21, 2021 at 11:54 am EDT

By Mat Ball October 21, 2021

For application developers and service owners who build and troubleshoot modern enterprise software, resolving production issues requires identifying poor performance across multiple networks, operating systems, servers, configs, and third-party dependencies. When the problem is the code itself, code profiling helps identify service bottlenecks by periodically capturing call stacks (snapshots of what each thread is executing) from the runtime environment. Information from call stacks provides additional context for slow spans in transaction traces and, when aggregated into flame graphs, shows service performance over time. The benefits are clear, but most other code profiling products incur notable performance overhead, so engineers must switch them on and off manually, trading application performance against available data.
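To make "periodically capturing call stacks" concrete, here is a minimal sketch of a sampling loop in plain Java. It is not how the Splunk agent is implemented; the class name `StackSampler`, the 100 ms interval, and the output format are assumptions chosen only for illustration. It relies on the standard `Thread.getAllStackTraces()` call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of any Splunk product.
final class StackSampler {

    // Takes one snapshot of every live thread.
    // Key: thread name plus id. Value: frames, innermost frame first.
    static Map<String, List<String>> snapshot() {
        return Thread.getAllStackTraces().entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey().getName() + "#" + e.getKey().getId(),
                e -> Arrays.stream(e.getValue())
                        .map(StackTraceElement::toString)
                        .collect(Collectors.toList())));
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        // Assumed interval of 100 ms; a real profiler picks its own rate.
        scheduler.scheduleAtFixedRate(
            () -> System.out.println("threads sampled: " + snapshot().size()),
            0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);          // let a few samples run
        scheduler.shutdownNow();    // stop sampling
    }
}
```

Each snapshot costs time proportional to the number of threads and their stack depth, which is one reason the sampling rate matters for overhead.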

We're proud to announce the Beta of AlwaysOn Profiling, part of Splunk APM. Available initially for Java-based applications, AlwaysOn provides continuous visibility of code-level performance, linked with unsampled trace data, with minimal overhead. Along with Splunk Synthetic Monitoring, Splunk RUM, Splunk Infrastructure Monitoring, Splunk Log Observer, and Splunk On-Call, AlwaysOn Profiling gives engineers more context to identify performance issues and troubleshoot faster across production environments.

Troubleshooting Code Bottlenecks with AlwaysOn Profiling

Splunk APM's AlwaysOn Profiler constantly monitors code performance to give you immediate context on where performance bottlenecks exist. Here are two examples of how AlwaysOn can help identify production issues:

Workflow One: Viewing Common Code in Your Slowest Traces
Engineers troubleshooting production issues often sort through example traces looking for common attributes in their slowest spans. AlwaysOn's call stacks are linked to trace data, providing context into which code is executed during each trace.

Within APM you can easily view latency within your production environment.

By clicking into any service you're taken to the service maps, which provide additional context on bottlenecks within that service and its dependencies.

From here, we can explore example traces.

Note: We set the "min" duration filter to 10,000 ms (ten seconds) to focus specifically on the slowest traces. We see that requests to /stats/races/fastest repeatedly take 40 seconds or more to respond.

By clicking into one of these long traces, the following screen opens:

We see that while the StatsController.fastestRace operation was executing, the agent collected 36 call stacks. Because the Java agent collects call stacks continuously, longer spans have more call stacks. Opening this span shows its metadata on the left and the call stacks that the agent collected on the right. We can use the "Previous" and "Next" buttons to flip through all the call stacks:

If several consecutive call stacks point to the same line of code, that line either takes a long time to execute or executes many times in a row. This is often a strong hint of a performance bottleneck.
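The same check you do by eye with "Previous" and "Next" can be sketched in code. The example below assumes you already have the top frame of each call stack in collection order; the class `HotFrameRun`, the frame `RaceRepository.findAll`, and the line numbers are hypothetical and exist only for this illustration.

```java
import java.util.List;

// Hypothetical helper, not part of Splunk APM.
final class HotFrameRun {

    // A frame and how many consecutive call stacks it topped.
    record Run(String frame, int length) {}

    // Returns the longest run of identical consecutive top frames.
    // For an empty list, the result has a null frame and length 0.
    static Run longestRun(List<String> topFrames) {
        String best = null;
        int bestLen = 0;
        String current = null;
        int currentLen = 0;
        for (String frame : topFrames) {
            if (frame.equals(current)) {
                currentLen++;            // same frame as the previous stack
            } else {
                current = frame;         // a new run starts here
                currentLen = 1;
            }
            if (currentLen > bestLen) {
                best = current;
                bestLen = currentLen;
            }
        }
        return new Run(best, bestLen);
    }

    public static void main(String[] args) {
        // Made-up top frames, in the order the stacks were collected.
        List<String> topFrames = List.of(
            "StatsController.fastestRace:42",
            "RaceRepository.findAll:17",
            "RaceRepository.findAll:17",
            "RaceRepository.findAll:17",
            "StatsController.fastestRace:42");
        System.out.println(longestRun(topFrames));
    }
}
```

A long run only says that the line was on top of the stack repeatedly; it does not distinguish one slow call from many fast calls in a loop.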

Workflow Two: Viewing Aggregate Performance of Services Over Time
Before you begin optimizing code, it's always helpful to understand which part of your source code impacts performance the most. How do you know which part is the biggest bottleneck? This is where aggregation of collected call stacks, in the form of flamegraphs, helps.

When viewing your service map, notice the code profiling section in the right-side panel. It automatically shows the top five frames from the call stacks collected for your selected time range, which already point to bottlenecks in the code.

By clicking into the feature, you're taken to a flame graph, a visual aggregation of the call stacks collected during the time range you've specified. The wider a horizontal bar, the more frequently that line of code appears in the collected call stacks.
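The aggregation behind a flame graph can be sketched as a tree of frames with a sample count on each node. This is a simplified illustration, not Splunk's implementation; the class `FlameNode` and the sample frames are hypothetical. It assumes each call stack is given from outermost caller to innermost callee (Java's `StackTraceElement[]` is the other way around, so it would need reversing first).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of flame graph aggregation.
final class FlameNode {
    final String frame;
    int samples;   // call stacks that pass through this frame (the bar width)
    final Map<String, FlameNode> children = new TreeMap<>();

    FlameNode(String frame) {
        this.frame = frame;
    }

    // Adds one call stack, ordered from outermost caller to innermost callee.
    void add(List<String> stack) {
        FlameNode node = this;
        node.samples++;
        for (String f : stack) {
            node = node.children.computeIfAbsent(f, FlameNode::new);
            node.samples++;
        }
    }

    // Renders the tree as indented text; counts stand in for bar widths.
    void print(String indent, StringBuilder out) {
        out.append(indent).append(frame)
           .append(" (").append(samples).append(")\n");
        for (FlameNode child : children.values()) {
            child.print(indent + "  ", out);
        }
    }

    public static void main(String[] args) {
        FlameNode root = new FlameNode("all");
        // Made-up call stacks for illustration.
        root.add(List.of("StatsController.fastestRace", "RaceRepository.findAll"));
        root.add(List.of("StatsController.fastestRace", "RaceRepository.findAll"));
        root.add(List.of("StatsController.fastestRace", "RaceSorter.sort"));
        StringBuilder out = new StringBuilder();
        root.print("", out);
        System.out.print(out);
    }
}
```

Frames that share a prefix are merged into one node, which is why a frequently sampled code path shows up as a single wide bar rather than many narrow ones.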

When viewing the flame graph, focus on the larger top-down "pillars", which indicate the lines of code that use the CPU the most. To highlight your own classes in the flame graph, use the filter in the top left.

Each horizontal bar of the flame graph shows the class name and line number of the code it represents. Flame graphs point you to the bottleneck causing the slowness; the final step in troubleshooting is returning to the source code to fix the problem.

Code Profiling within Splunk's Observability Solutions

Unlike dedicated code profiling solutions, Splunk's AlwaysOn Profiler links each collected call stack to the span that is executing at the moment the call stack is collected. This helps separate data about background threads from data about the active threads that service incoming requests, greatly reducing the time engineers need to analyze profiling data.
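This post does not describe how the linking is implemented, so the following is only a sketch of the idea under a stated assumption: a call stack belongs to a span if it was sampled on the span's thread while the span was open. The class `SpanLinker`, its records, and all IDs and timestamps are hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration; the real agent may link stacks differently.
final class SpanLinker {

    record Span(String spanId, long threadId, long startMillis, long endMillis) {}

    record Sample(long threadId, long timestampMillis) {}

    // Returns the id of the span that was open on the sampled thread,
    // or empty if the sample came from a thread with no active span.
    static Optional<String> spanFor(Sample sample, List<Span> spans) {
        return spans.stream()
            .filter(sp -> sp.threadId() == sample.threadId()
                && sp.startMillis() <= sample.timestampMillis()
                && sample.timestampMillis() <= sp.endMillis())
            .map(Span::spanId)
            .findFirst();
    }

    public static void main(String[] args) {
        // Made-up span and samples for illustration.
        List<Span> spans = List.of(new Span("span-1", 7L, 1_000L, 41_000L));
        Sample requestThread = new Sample(7L, 20_000L);
        Sample backgroundThread = new Sample(99L, 20_000L);
        System.out.println(spanFor(requestThread, spans).orElse("background"));
        System.out.println(spanFor(backgroundThread, spans).orElse("background"));
    }
}
```

Samples with no matching span are the background work that can be set aside when investigating a slow request.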

Additionally, with Splunk's AlwaysOn Profiler, data collection is automatic and low-overhead. Instead of switching the profiler on during production incidents, users only need to deploy the Splunk-flavored OpenTelemetry agent, which then collects data continuously in the background.

Try It Today!

With AlwaysOn Profiling, teams using Splunk APM can now analyze and improve both the intra-service performance of code-heavy monoliths and the inter-service performance of microservice-based architectures, to troubleshoot bottlenecks and optimize service performance at any stage of cloud migration.

Sign up for the preview to get started today.

Follow all the conversations coming out of #splunkconf21!



Disclaimer

Splunk Inc. published this content on 21 October 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 21 October 2021 15:53:09 UTC.
