Quickly identify a high number of skipped searches on your search head cluster or standalone search head(s) (SH):
index=_internal skipped sourcetype=scheduler status=skipped host=[your splunk SH(s)] | stats count by app search_type reason savedsearch_name | sort -count
Replace "[your splunk SH(s)]" with the SH(s) you want to check.
Find Blocked Queues
Blocked queues are bad for your environment, so here is a search to identify them:
index=_internal sourcetype=splunkd group=queue (name=parsingQueue OR name=indexqueue OR name=tcpin_queue OR name=aggqueue) | eval is_blocked=if(blocked=="true",1,0), host_queue=host." - ".name | stats sparkline sum(is_blocked) as blocked,count by host_queue | eval blocked_ratio=round(blocked/count*100,2) | sort 20 -blocked_ratio | eval requires_attention=case(blocked_ratio>50.0,"fix highly recommended!",blocked_ratio>40.0,"you better check..",blocked_ratio>20.0,"usually no need […]
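To make the ratio logic of the search above easier to follow, here is a minimal Python sketch of the same calculation: flag each queue event as blocked or not, group by host and queue name, and report the percentage of blocked events for the 20 worst offenders. The function name `blocked_ratios` and the sample events are hypothetical and only serve as illustration; the attention thresholds are left out because the original search is truncated.

```python
from collections import defaultdict


def blocked_ratios(events, limit=20):
    """Return (host_queue, blocked_ratio) pairs, highest ratio first.

    `events` is an iterable of dicts with 'host', 'name' and an optional
    'blocked' key, mimicking queue events (hypothetical input shape).
    """
    totals = defaultdict(lambda: [0, 0])  # host_queue -> [blocked, count]
    for event in events:
        host_queue = f"{event['host']} - {event['name']}"
        # Same idea as: eval is_blocked=if(blocked=="true",1,0)
        totals[host_queue][0] += 1 if event.get("blocked") == "true" else 0
        totals[host_queue][1] += 1
    # Same idea as: eval blocked_ratio=round(blocked/count*100,2)
    rows = [(key, round(blocked / count * 100, 2))
            for key, (blocked, count) in totals.items()]
    # Same idea as: sort 20 -blocked_ratio
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up sample events, for illustration only
sample_events = [
    {"host": "idx1", "name": "parsingQueue", "blocked": "true"},
    {"host": "idx1", "name": "parsingQueue"},
    {"host": "idx2", "name": "indexqueue"},
]
print(blocked_ratios(sample_events))
```

In this sample, half of the events for idx1's parsing queue are blocked, so that queue is listed first.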
Linux Free Disk Space
The following Splunk query charts the percentage of free disk space on the root mount per host over time using timechart:
index=os sourcetype=df PercentFreeSpace=* mount="/" | timechart latest(PercentFreeSpace) by host
Linux Memory Usage
The following Splunk search shows memory usage per Linux host over time using timechart:
index=os sourcetype=top pctMEM=* | transaction host _time | streamstats window=1 global=f sum(pctMEM) as MEM | timechart latest(MEM) by host
Linux CPU Usage
The following query shows CPU usage per host over time using timechart:
index=os sourcetype=top pctCPU=* | transaction host _time | streamstats window=1 global=f sum(pctCPU) as CPU | timechart latest(CPU) by host
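The memory and CPU searches share one pattern: group the per-process rows from top by host and timestamp, then add up the percentage field for each group. The Python sketch below shows that apparent intent in plain code; the function name `usage_per_host` and the sample rows are hypothetical, and the sketch simply sums every row rather than reproducing the exact behaviour of transaction and streamstats.

```python
from collections import defaultdict


def usage_per_host(rows, field="pctCPU"):
    """Sum `field` over all process rows sharing the same host and _time.

    `rows` is an iterable of dicts with 'host', '_time' and the chosen
    percentage field (hypothetical input shape).
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["host"], row["_time"])] += float(row[field])
    return dict(totals)


# Made-up per-process rows, for illustration only
sample_rows = [
    {"host": "web1", "_time": 100, "pctCPU": 1.5},
    {"host": "web1", "_time": 100, "pctCPU": 2.5},
    {"host": "web1", "_time": 160, "pctCPU": 3.0},
    {"host": "db1", "_time": 100, "pctCPU": 0.5},
]
print(usage_per_host(sample_rows))
```

Passing field="pctMEM" gives the memory variant, assuming the rows carry that field.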