Grafana: Top N values: Difference between revisions

From Wiki-WebPerfect

Revision as of 6 October 2020, 16:11

Top N values

Telegraf configuration

Because there is no Perfmon counter for the total size of Cluster Shared Volumes (CSV), I collect the values with a PowerShell command that Telegraf runs through its exec input.

 [[inputs.exec]]
   commands = ['''powershell.exe -NoProfile -Command "Get-Volume | Where-Object {$_.FileSystem -eq 'CSVFS'} | select FileSystemLabel, AllocationUnitSize, Size, SizeRemaining, @{N='SizeUsed';E={$_.Size - $_.SizeRemaining}} | ConvertTo-Json"''']
   name_override = "cluster_csv"
   data_format = "json"
   data_type = "float"
   tag_keys = ["FileSystemLabel"] 

Measurement = cluster_csv
Tags = FileSystemLabel
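
To make the mapping from the PowerShell JSON output to the measurement, tag, and fields more concrete, here is a minimal Python sketch. It is a simplified model and not Telegraf's actual parser: the function name json_to_points and the sample values are assumptions for illustration. It assumes that keys listed in tag_keys become tags, numeric values become float fields, and other string values are dropped.

```python
import json


def json_to_points(raw_json, measurement, tag_keys):
    """Simplified, hypothetical model of turning JSON output into tagged points."""
    parsed = json.loads(raw_json)
    # ConvertTo-Json emits a single object for one volume and an array for several.
    records = parsed if isinstance(parsed, list) else [parsed]
    points = []
    for record in records:
        # Keys named in tag_keys are stored as string tags.
        tags = {key: str(record[key]) for key in tag_keys if key in record}
        # Remaining numeric values are stored as float fields (assumption).
        fields = {
            key: float(value)
            for key, value in record.items()
            if key not in tag_keys
            and isinstance(value, (int, float))
            and not isinstance(value, bool)
        }
        points.append({"measurement": measurement, "tags": tags, "fields": fields})
    return points


# Made-up sample resembling the output of the PowerShell command.
sample_output = json.dumps([
    {"FileSystemLabel": "CSV01", "AllocationUnitSize": 4096,
     "Size": 1000, "SizeRemaining": 400, "SizeUsed": 600},
])
points = json_to_points(sample_output, "cluster_csv", ["FileSystemLabel"])
```

With this sample, the single point carries the tag FileSystemLabel and the numeric fields Size, SizeRemaining, SizeUsed, and AllocationUnitSize, which are the names the InfluxQL query refers to.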


InfluxQL in Grafana

Size = Total size of the CSV
SizeUsed = Used size of the CSV
FileSystemLabel = Name of the CSV

InfluxQL:

SELECT top("UsedSpace (%)","FileSystemLabel",5) AS "UsedSpace (%)" FROM (
  SELECT (100 / mean("Size")) * mean("SizeUsed") AS "UsedSpace (%)"
  FROM "$rp"."cluster_csv" 
  WHERE $timeFilter 
  GROUP BY "FileSystemLabel"
) 

Description of the Query:
$rp: A Grafana template variable that selects the InfluxDB retention policy. If you define it as a template variable, you can change the retention policy for the whole dashboard at once (you don't have to change each panel).
First Query (inner): Calculates the percentage of used space per CSV over the time range of the dashboard.
Second Query (outer): Selects the top 5 CSVs based on the value of "UsedSpace (%)" per "FileSystemLabel" (CSV name).
Conclusion: The result is a table of the 5 CSVs with the highest percentage of used space.
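
The two query steps described above can be traced with a small Python sketch. It only mirrors the arithmetic of the InfluxQL query and does not talk to InfluxDB. The function name top_used_space and the sample values are assumptions for illustration.

```python
from statistics import mean


def top_used_space(samples, n=5):
    """samples maps a CSV label to a list of (size, size_used) data points."""
    percentages = {}
    for label, points in samples.items():
        # Inner query: (100 / mean("Size")) * mean("SizeUsed") per FileSystemLabel
        mean_size = mean(size for size, _ in points)
        mean_used = mean(used for _, used in points)
        percentages[label] = (100 / mean_size) * mean_used
    # Outer query: keep the n labels with the highest percentage
    ranked = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up data points within the dashboard time range.
samples = {
    "CSV01": [(1000, 500), (1000, 700)],
    "CSV02": [(2000, 1800), (2000, 1800)],
    "CSV03": [(500, 50), (500, 50)],
}
result = top_used_space(samples, n=2)
```

With these sample values, CSV02 ranks first at about 90 % and CSV01 second at about 60 %, while CSV03 (about 10 %) falls outside the top 2.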

Grafana Panel settings:
FORMAT AS = Table
Visualization = Table
Transform = Organize fields = Hiding "Time" and renaming FileSystemLabel to CSV
Overrides = Fields with name = "UsedSpace (%)" -> Unit = "Percent (0-100)", Decimals = 1, Cell Display mode = "Gradient gauge"



Top N values for multiple instances

You only need the following InfluxQL if a host reports multiple "instance" values and you have to sum the instances per node before selecting the top N. In this example each host has two network interfaces, and we want to know the utilization of the whole host rather than of each NIC.

Telegraf configuration

  [[inputs.win_perf_counters.object]]
    ObjectName = "Network Interface"
    Instances = ["*"]
    Counters = [
      "Bytes Received/sec",
      "Bytes Sent/sec",
      "Current Bandwidth",
    ]
    Measurement = "win_net" 


InfluxQL in Grafana

  SELECT top("Average","host",5) AS "Average" FROM (
    SELECT sum("Average") AS "Average" FROM (
      SELECT ((100 / (mean("Current_Bandwidth") / 8)) * (mean("Bytes_Sent_persec") + mean("Bytes_Received_persec"))) AS "Average"
      FROM "$rp"."win_net" 
      WHERE $timeFilter AND "instance" !~ /^Hyper-V.+/
      GROUP BY "host", "instance"
    ) GROUP BY "host"
  ) 

Description of the Query:
First Query (innermost): Converts "Current_Bandwidth" from bits/sec to bytes/sec and then calculates the used bandwidth (received and sent) as a percentage. Only instances whose name does not start with "Hyper-V" are used, and the result is grouped by "host" and "instance".
Second Query: Sums the utilization of the two NICs for each host.
Third Query (outermost): Selects the top 5 hosts based on the summed utilization of their NICs.
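
The three nested queries can be followed with a short Python sketch of the same arithmetic. It does not query InfluxDB. The function names, the dictionary keys, and the sample values are assumptions for illustration, and each sample value stands in for the mean over the dashboard time range.

```python
import re
from collections import defaultdict

# Same filter as in the query: skip instances whose name starts with "Hyper-V".
HYPER_V = re.compile(r"^Hyper-V.+")


def nic_utilization(bandwidth_bits, sent_bytes, received_bytes):
    """First query: percentage of the link used by sent plus received traffic."""
    bandwidth_bytes = bandwidth_bits / 8  # bits/sec -> bytes/sec
    return (100 / bandwidth_bytes) * (sent_bytes + received_bytes)


def top_hosts(nics, n=5):
    per_host = defaultdict(float)
    for nic in nics:
        if HYPER_V.match(nic["instance"]):
            continue
        # Second query: sum the per-NIC utilization for each host
        per_host[nic["host"]] += nic_utilization(
            nic["bandwidth"], nic["sent"], nic["received"]
        )
    # Third query: keep the n hosts with the highest summed utilization
    return sorted(per_host.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up values: 1 Gbit/s links, traffic in bytes/sec.
nics = [
    {"host": "node1", "instance": "NIC1", "bandwidth": 1e9, "sent": 25e6, "received": 12.5e6},
    {"host": "node1", "instance": "NIC2", "bandwidth": 1e9, "sent": 10e6, "received": 2.5e6},
    {"host": "node1", "instance": "Hyper-V Virtual Ethernet Adapter", "bandwidth": 1e10, "sent": 5e6, "received": 5e6},
    {"host": "node2", "instance": "NIC1", "bandwidth": 1e9, "sent": 6.25e6, "received": 6.25e6},
    {"host": "node2", "instance": "NIC2", "bandwidth": 1e9, "sent": 3.125e6, "received": 3.125e6},
]
ranking = top_hosts(nics, n=5)
```

With these sample values, node1 sums to about 40 (30 + 10) and node2 to about 15 (10 + 5). Because the per-NIC percentages are added and sent and received traffic are combined, the value for a host with two NICs can exceed 100.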

Grafana Panel settings: Same as the Grafana panel settings above.