dimension-occurrence-stats.sql

USAGE:  event1|event2|sensor|type|user [--long]

If the --long option is not given, only basic information is shown. For example, a partial sample output of psqlf /opt/lce/tools/pg-helper-sql/dimension-occurrence-stats.sql type is:

nickn name total # rank by total # First Seen Latest Seen
2 network 39,234,655 1 2019 Aug23 15:30 2019 Sep19 18:30
15 database 1,765,733 19 2019 Aug24 16:30 2019 Sep10 19:30
10 restart 1,679,365 20 2019 Aug24 13:00 2019 Sep17 06:30

This table shows that no events of type=database have been normalized since Sep 10th. Information such as this can be helpful in troubleshooting PRM issues.

 

If the --long option is given, these additional 5 columns (min, 25th %, median, 75th %, and max) are generated. For example, a partial sample output of psqlf /opt/lce/tools/pg-helper-sql/dimension-occurrence-stats.sql type --long would be:

nickn name total # rank by total # First Seen Latest Seen min 25th % median 75th % max
2 network 39,234,655 1 2019 Aug23 15:30 2019 Sep19 18:30 0.0073 0.9783 0.9939 0.9987 0.9997
15 database 1,765,733 19 2019 Aug24 16:30 2019 Sep10 19:30 0.0093 0.0094 0.0094 0.0094 0.0102
10 restart 1,679,365 20 2019 Aug24 13:00 2019 Sep17 06:30 0.0001 0.0002 0.0087 0.0092 0.0098

The last 5 columns on the right show what fraction of events are normalized with respective dimension. For example, name = network:

  • 0.0073 (equivalent to 0.73%) — minimum, also called 0th percentile

  • At some point , events constituted fraction of 0.9997 (equivalent to 99.97%) — maximum, also called 100th percentile

  • The fraction of 0.9939 (equivalent to 99.39%) has been the median average, also called 50th percentile) The median is a superior way to measure a dataset's average, because it is not as easily skewed by outliers.