A log is a chronologically ordered collection of messages from various network devices and hardware. Log messages are generated automatically in response to events and often include natural-language notes. Logs can be written to files on disk or sent as a stream of messages over the network to a log collector. They support monitoring and maintaining hardware and software performance, parameter tuning, emergency response and recovery for software and systems, and optimization of applications and infrastructure.
Natural language processing (NLP) techniques are widely used in log analysis and log mining:
- Log analysis is the process of extracting information from logs while accounting for the differing syntax and semantics of messages across log files. Interpreting each message in the context of its application makes it possible to compare log files from various sources, detect anomalies, and find correlations.
- Log mining, also known as log knowledge discovery, is the process of extracting patterns and correlations from logs in order to uncover knowledge and to detect or forecast anomalies, if any are present in the log messages.
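As a rough illustration of pattern extraction, the sketch below masks variable parts of each message (IP addresses and numbers) to obtain a message template, counts how often each template occurs, and treats rarely seen templates as anomaly candidates. The function names `to_template` and `rare_templates`, the placeholder strings, and the `max_count` threshold are assumptions made for this example; real log mining tools use considerably more robust template extraction.

```python
import re
from collections import Counter


def to_template(message):
    """Replace variable-looking tokens with placeholders (illustrative rules only)."""
    # Mask dotted-quad IP addresses first so their digits are not masked separately.
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", message)
    # Mask any remaining standalone integers.
    message = re.sub(r"\b\d+\b", "<NUM>", message)
    return message


def rare_templates(messages, max_count=1):
    """Return templates that occur at most max_count times (hypothetical threshold)."""
    counts = Counter(to_template(m) for m in messages)
    return [template for template, count in counts.items() if count <= max_count]


if __name__ == "__main__":
    # Made-up sample messages, not taken from any real system.
    sample = [
        "Connection from 10.0.0.5 port 22",
        "Connection from 10.0.0.9 port 2222",
        "Disk failure on device 3",
    ]
    for template in rare_templates(sample):
        print("anomaly candidate:", template)
```

Treating rare templates as suspicious is only one simple heuristic; a frequent template can also be anomalous if its rate changes suddenly.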
To convert log messages into structured form, techniques such as tokenization, stemming, lemmatization, and parsing are used. Once the logs are available in a well-structured form, log analysis and log mining are applied to extract relevant information and knowledge from the data.
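The conversion to structured form can be sketched in a few lines of Python. The example assumes a hypothetical line layout of `<date> <time> <LEVEL> <component>: <message>`; the pattern, the field names, and the `parse_line` and `tokenize` helpers are all illustrative. It covers only parsing and tokenization, since stemming and lemmatization would typically rely on an NLP library.

```python
import re

# Assumed line layout: "YYYY-MM-DD HH:MM:SS LEVEL component: free-text message"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<component>[\w.-]+): "
    r"(?P<message>.*)"
)


def tokenize(message):
    """Lowercase the message and split it into word-like tokens."""
    # Keep dots, slashes, colons, and hyphens so IPs and paths stay in one token.
    return re.findall(r"[a-z0-9_./:-]+", message.lower())


def parse_line(line):
    """Parse one raw log line into a dict of fields, or None if it does not match."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["tokens"] = tokenize(record["message"])
    return record


if __name__ == "__main__":
    # Made-up sample line for demonstration.
    raw = "2024-01-15 10:32:01 ERROR sshd: Failed password for root from 10.0.0.5"
    print(parse_line(raw))
```

Returning `None` for lines that do not match keeps the parser tolerant of mixed formats, which is common when logs from several sources are combined.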