I might just be missing a term or a concept: Googling things like "measure acyclic events" mostly brings up irrelevant results (like DAGs). I'd be happy to continue researching on my own given any pointers (even just keywords) in the right direction.
Is there a technique suited for visualizing the "surprisingness" of events, given time-series data for many events?
When I'm woken up for an on-call alert at 3am and I open up a 200 MB server log on some unfamiliar cloud instance, it can be hard to tell whether some of the recent log events are "normal" or "interesting":
2018-10-10T22:08:36.771+00:00 f0fe6c8f872a adwallet[1]: Service adgenerator-sandbox: unable to sample a connection
2018-10-10T22:08:36.774+00:00 f0fe6c8f872a adwallet[1]: 2 1:[ ] ApolloIncomingRequest 2 POST hm://adwallet-sandbox/internal/v1/receipt/send 0577e7186bf5ba-6bb9c9-be0d-7ce4-00000000
2018-10-10T22:08:36.776+00:00 f0fe6c8f872a adwallet[1]: sending reply: V2Message{uri='hm://adwallet-sandbox/internal/v1/receipt/send', preamble=Preamble{command=REPLY, hasBody=true, traceFlag=false, p
2018-10-10T22:08:36.783+00:00 f0fe6c8f872a adwallet[1]: Service adgenerator-sandbox: unable to sample a connection
2018-10-10T22:08:36.787+00:00 f0fe6c8f872a adwallet[1]: sending request: V2Message{uri='hm://adgenerator-sandbox/internal/v1/ads/02b882b2-ba37-4a99-bf01-4b14119a9a44/createdByUserId', preamble=Preambl
2018-10-10T22:08:36.796+00:00 f0fe6c8f872a adwallet[1]: message received: [id: 0xf2ac5273, /w.x.y.z:58477 => test-adgeneratorsandbox-a-gzh5.test.foo.net./w.x.y.z:32371], ZMTPIncomingMessag
2018-10-10T22:08:36.798+00:00 f0fe6c8f872a adwallet[1]: 3 1:[ ] ApolloOutgoingRequest 1 GET hm://adgenerator-sandbox/internal/v1/ads/02b882b2-ba37-4a99-bf01-4b14119a9a44/createdByUserId
2018-10-10T22:08:36.808+00:00 f0fe6c8f872a adwallet[1]: creating new client for srv://adsgtm-sandbox.services.test.foo.net
2018-10-10T22:08:36.811+00:00 f0fe6c8f872a adwallet[1]: Lazily loaded a hermes client for endpoint 'srv://adsgtm-sandbox.services.test.foo.net'. Consider preloading your clients for dependencies t
2018-10-10T22:08:36.825+00:00 f0fe6c8f872a adwallet[1]: disconnected: [id: 0xac497c9e, /w.x.y.z:58113 :> /w.x.y.z:5700]
2018-10-10T22:08:36.827+00:00 f0fe6c8f872a adwallet[1]: Service adsgtm-sandbox: unable to sample a connection
2018-10-10T22:08:36.831+00:00 f0fe6c8f872a adwallet[1]: Service adsgtm-sandbox: endpoints changing, adding 1 and removing 0 connections
2018-10-10T22:08:36.834+00:00 f0fe6c8f872a adwallet[1]: Service adsgtm-sandbox: unable to sample a connection
2018-10-10T22:08:36.835+00:00 f0fe6c8f872a adwallet[1]: Configuring TCP keepalive: fd=227, channel=[id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278]
2018-10-10T22:08:36.836+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPCNT: fd=227, channel=[id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278]: before: get(
2018-10-10T22:08:36.836+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPIDLE: fd=227, channel=[id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278]: before: get
2018-10-10T22:08:36.836+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPINTVL: fd=227, channel=[id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278]: before: ge
2018-10-10T22:08:36.840+00:00 f0fe6c8f872a adwallet[1]: connected: [id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278]
2018-10-10T22:08:36.845+00:00 f0fe6c8f872a adwallet[1]: Service adsgtm-sandbox: unable to sample a connection
2018-10-10T22:08:36.850+00:00 f0fe6c8f872a adwallet[1]: sending request: V2Message{uri='hm://adsgtm-sandbox/v2/transaction/adyen/8525343641112390', preamble=Preamble{command=REQUEST, hasBody=false, tr
2018-10-10T22:08:36.873+00:00 f0fe6c8f872a adwallet[1]: message received: [id: 0x4f603eec, /w.x.y.z:48075 => test-adsgtmsandbox-z-822v.test.foo.net./w.x.y.z:29278], ZMTPIncomingMessage{ses
2018-10-10T22:08:36.876+00:00 f0fe6c8f872a adwallet[1]: 4 1:[ ] ApolloOutgoingRequest 1 GET hm://adsgtm-sandbox/v2/transaction/adyen/8525343641112390 0577e71890376c-57afb0-d03f-0001-
2018-10-10T22:08:36.976+00:00 f0fe6c8f872a adwallet[1]: creating new client for srv://ad-account-service-sandbox.services.test.foo.net
2018-10-10T22:08:36.978+00:00 f0fe6c8f872a adwallet[1]: Lazily loaded a hermes client for endpoint 'srv://ad-account-service-sandbox.services.test.foo.net'. Consider preloading your clients for de
2018-10-10T22:08:36.989+00:00 f0fe6c8f872a adwallet[1]: Service ad-account-service-sandbox: unable to sample a connection
2018-10-10T22:08:36.992+00:00 f0fe6c8f872a adwallet[1]: Service ad-account-service-sandbox: unable to sample a connection
2018-10-10T22:08:37.001+00:00 f0fe6c8f872a adwallet[1]: Service ad-account-service-sandbox: endpoints changing, adding 1 and removing 0 connections
2018-10-10T22:08:37.005+00:00 f0fe6c8f872a adwallet[1]: Configuring TCP keepalive: fd=253, channel=[id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./10.174.20.
2018-10-10T22:08:37.006+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPCNT: fd=253, channel=[id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./w.x.y.z:27330]: bef
2018-10-10T22:08:37.006+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPIDLE: fd=253, channel=[id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./w.x.y.z:27330]: be
2018-10-10T22:08:37.006+00:00 f0fe6c8f872a adwallet[1]: TCP_KEEPINTVL: fd=253, channel=[id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./w.x.y.z:27330]: b
2018-10-10T22:08:37.006+00:00 f0fe6c8f872a adwallet[1]: Service ad-account-service-sandbox: unable to sample a connection
2018-10-10T22:08:37.008+00:00 f0fe6c8f872a adwallet[1]: connected: [id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./w.x.y.z:27330]
2018-10-10T22:08:37.011+00:00 f0fe6c8f872a adwallet[1]: sending request: V2Message{uri='hm://ad-account-service-sandbox/proto/AdAccountService/v1/getUserAccount', preamble=Preamble{command=REQUEST, ha
2018-10-10T22:08:37.150+00:00 f0fe6c8f872a adwallet[1]: message received: [id: 0xbefd5d13, /w.x.y.z:48508 => test-adaccountservicesandbox-a-rg7f.test.foo.net./w.x.y.z:27330], ZMTPIncomingMe
2018-10-10T22:08:37.152+00:00 f0fe6c8f872a adwallet[1]: 5 1:[ ] ApolloOutgoingRequest 1 POST hm://ad-account-service-sandbox/proto/AdAccountService/v1/getUserAccount 0577e71892c851-5
2018-10-10T22:08:37.171+00:00 f0fe6c8f872a adwallet[1]: Getting info for ad 02b882b2-ba37-4a99-bf01-4b14119a9a44
What I generally do is look for a suspicious message, then search around to see whether that message has occurred before. The naive approach would be to treat something that has never happened before as "interesting," and to treat a message as less interesting the more often it occurs. But frequency alone is not enough: I also want to incorporate the regularity of events. If a server reboots every day at 5am sharp but there is a reboot at 2pm, then that's highly irregular. Then there are events that happen constantly but with no regularity at all (e.g. Service ad-account-service-sandbox: unable to sample a connection), so no single occurrence of them is truly "surprising."
I think what I don't understand is how to model the frequency of events, and how to measure the deviation of an observed event "from" that frequency.
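To make that concrete, here is a minimal sketch of one possible way to model it, assuming each event class is reduced to a sorted list of timestamps in seconds. The function names `gap_stats` and `gap_surprise` are made up for illustration. The coefficient of variation of the gaps between occurrences is close to 0 for a strictly periodic event and close to 1 for events arriving at random (exponentially distributed gaps), and a z-score on the newest gap gives a rough "error" from the usual frequency.

```python
from statistics import mean, pstdev


def _gaps(timestamps):
    """Gaps in seconds between consecutive (sorted) event times."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def gap_stats(timestamps):
    """Return (mean gap, coefficient of variation) for one event class."""
    gaps = _gaps(timestamps)
    mu = mean(gaps)
    cv = pstdev(gaps) / mu if mu else float("inf")
    return mu, cv


def gap_surprise(timestamps, new_time):
    """Z-score of the gap between the last known event and new_time."""
    gaps = _gaps(timestamps)
    mu, sigma = mean(gaps), pstdev(gaps)
    gap = new_time - timestamps[-1]
    if sigma == 0:
        # Perfectly periodic history: any deviation at all is maximal.
        return 0.0 if gap == mu else float("inf")
    return abs(gap - mu) / sigma


# Hypothetical history: a reboot around 5am on five consecutive days.
DAY = 86400
jitter = [0, 60, -30, 20, -50]
reboots = [d * DAY + 5 * 3600 + j for d, j in enumerate(jitter)]

_, cv = gap_stats(reboots)
on_time = gap_surprise(reboots, 5 * DAY + 5 * 3600)   # next day, 5am
off_time = gap_surprise(reboots, 4 * DAY + 14 * 3600)  # same day, 2pm
print(f"cv={cv:.5f} on_time={on_time:.2f} off_time={off_time:.2f}")
```

A constantly recurring but irregular message would show a coefficient of variation near 1, and the z-score of any single gap would stay small, which matches the intuition that nothing about it is surprising.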
Is there something in the space of DSP that might do a good job of this? For example, what would the Fourier transform of the combined time series of events represent if I mapped each unique event to a discrete amplitude? (After stripping things like timestamps, UUIDs, and all numbers in general, since I'm interested in the classes of events, not the specifics.)
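The stripping step could be sketched roughly as below. This assumes the line layout of the excerpt above (timestamp, then host id, then the message); the placeholder tokens and the name `event_class` are invented for illustration, and the regexes are a guess at what counts as "specifics".

```python
import re
from collections import Counter

# Order matters: the most specific patterns are replaced first, so that
# the generic digit rule does not break up UUIDs or hex ids.
_PATTERNS = [
    (re.compile(r"^\S+\s+\S+\s+"), ""),  # leading timestamp and host id
    (re.compile(r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"), "<UUID>"),
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),
    (re.compile(r"\d+"), "<N>"),
]


def event_class(line):
    """Reduce a raw log line to a template naming its class of event."""
    for pattern, replacement in _PATTERNS:
        line = pattern.sub(replacement, line)
    return line.strip()


lines = [
    "2018-10-10T22:08:36.836+00:00 f0fe6c8f872a adwallet[1]: "
    "TCP_KEEPCNT: fd=227, channel=[id: 0x4f603eec",
    "2018-10-10T22:08:37.006+00:00 f0fe6c8f872a adwallet[1]: "
    "TCP_KEEPCNT: fd=253, channel=[id: 0xbefd5d13",
    "2018-10-10T22:08:37.171+00:00 f0fe6c8f872a adwallet[1]: "
    "Getting info for ad 02b882b2-ba37-4a99-bf01-4b14119a9a44",
]

# Count how often each class occurs; rare classes are candidates to inspect.
counts = Counter(event_class(line) for line in lines)
for template, n in counts.most_common():
    print(n, template)
```

Each class then gets its own list of timestamps, which is the input any frequency or regularity analysis would need.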
I don't know the term for it, but I did some similar work before. The aim was to display in a different color the jobs that started at an unusual time, either because it was a one-off run for dev or test, or because the job was delayed while waiting for a previous job to finish.
The solution was to train the system ("train" is a big word for estimating the probability of a run in each time slot, with the day discretized into 30-minute slots), and to report every run with a low probability of starting at its observed start time.
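A rough reconstruction of that idea, not the original code: the slot length follows the description above, while the add-one smoothing, the 0.05 threshold, and the function names are assumptions made for the sake of a runnable sketch.

```python
from collections import Counter
from datetime import datetime, timedelta

SLOT_MINUTES = 30
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES


def slot_of(ts):
    """Index of the 30-minute slot of the day that ts falls into."""
    return (ts.hour * 60 + ts.minute) // SLOT_MINUTES


def train(start_times):
    """Estimate the probability of a start in each slot of the day.

    Add-one smoothing (an assumption) keeps never-seen slots above zero.
    """
    counts = Counter(slot_of(t) for t in start_times)
    total = len(start_times) + SLOTS_PER_DAY
    return [(counts[s] + 1) / total for s in range(SLOTS_PER_DAY)]


def is_unusual(probs, ts, threshold=0.05):
    """True if a start at ts falls into a low-probability slot."""
    return probs[slot_of(ts)] < threshold


# Hypothetical history: the job started at 05:05 on 20 consecutive days.
first = datetime(2018, 9, 20, 5, 5)
history = [first + timedelta(days=d) for d in range(20)]
probs = train(history)

print(is_unusual(probs, datetime(2018, 10, 10, 5, 10)))  # usual slot
print(is_unusual(probs, datetime(2018, 10, 10, 14, 0)))  # 2pm run
```

Fixed slots have an edge effect: a job that normally starts at 05:29 and one day starts at 05:31 lands in a different slot, so smoothing across neighbouring slots may be worth considering.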
It worked very well, except that it did not answer the burning question: "Which job did not start when it should have?" (the errors of the second type, i.e. the misses). In your framework: if a job usually starts at 5am (±15 min), the fact that it did not start on Oct 15th should also be an "interesting" event.
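Detecting those misses needs a pass over the calendar rather than over the observed events. A minimal sketch, assuming the expected start time and tolerance are already known (here hard-coded to the 5am ±15 min example); `missing_runs` is a hypothetical helper, and it does not handle windows that cross midnight.

```python
from datetime import date, datetime, time, timedelta


def missing_runs(start_times, first_day, last_day,
                 expected=time(5, 0), tolerance=timedelta(minutes=15)):
    """Return the days in [first_day, last_day] with no start in the window."""
    seen = set()
    for ts in start_times:
        target = datetime.combine(ts.date(), expected)
        if abs(ts - target) <= tolerance:
            seen.add(ts.date())

    missing = []
    day = first_day
    while day <= last_day:
        if day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Hypothetical history: on Oct 15th the job only ran at 2pm.
starts = [
    datetime(2018, 10, 13, 5, 3),
    datetime(2018, 10, 14, 4, 58),
    datetime(2018, 10, 15, 14, 0),
    datetime(2018, 10, 16, 5, 7),
]
gaps_found = missing_runs(starts, date(2018, 10, 13), date(2018, 10, 16))
print(gaps_found)
```

The 2pm run on Oct 15th would be flagged by the low-probability check, and the absence of the 5am run would be flagged here, so the two checks complement each other.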
Please, keep me up to date on your progress.