sos.cleaner.parsers — Cleaning Parser Definition

class sos.cleaner.parsers.SoSCleanerParser(conf_file=None)

Bases: object

Parsers are used to build objects that take a line as input, parse it for a particular pattern (e.g. IP addresses), and then make any necessary substitutions by referencing the SoSMap() associated with the parser.

Ideally, a new parser subclass will only need to set the class-level attributes in order to be fully functional.


Parameters: conf_file (str) – The configuration file to read from

  • name (str) – The parser name, used in logging errors
  • regex_patterns (list) – A list of regex patterns to iterate over for every line processed
  • mapping (SoSMap()) – Used by the parser to store and obfuscate matches
  • map_file_key (str) – The key in the map_file to read when loading previous obfuscation matches
  • prep_map_file – File to read from an archive to pre-seed the map with matches, e.g. ip_addr for loading IP addresses
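
As a minimal sketch of the "set only the class-level attributes" idea, the example below does not import sos. `SketchParser`, `find_matches()`, and `IPv4SketchParser` are hypothetical stand-ins, a plain dict stands in for SoSMap(), and the attribute values and the IPv4 regex are illustrative assumptions rather than the values sos ships with.

```python
import re


class SketchParser:
    """Hypothetical stand-in for a base parser; NOT the sos implementation."""

    # Class-level attributes mirroring the names documented above.
    name = 'Undefined Parser'
    regex_patterns = []
    map_file_key = 'unset'
    prep_map_file = 'unset'

    def __init__(self, conf_file=None):
        self.conf_file = conf_file
        # Assumption: a plain dict stands in for the SoSMap() object.
        self.mapping = {}

    def find_matches(self, line):
        """Return every substring of line matched by any configured pattern."""
        found = []
        for pattern in self.regex_patterns:
            found.extend(m.group(0) for m in re.finditer(pattern, line))
        return found


class IPv4SketchParser(SketchParser):
    """A subclass that overrides nothing but class-level attributes."""

    name = 'IPv4 Sketch Parser'
    # Deliberately loose IPv4 pattern, for illustration only.
    regex_patterns = [r'\b(?:\d{1,3}\.){3}\d{1,3}\b']
    map_file_key = 'ip_map'
    prep_map_file = 'ip_addr'


parser = IPv4SketchParser()
matches = parser.find_matches('eth0 inet 192.168.1.10 peer 10.0.0.1')
print(parser.name, matches)
```

The subclass defines no methods of its own; all behaviour comes from the generic logic in the base class, driven by the attributes the subclass sets.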

Get the contents of the mapping used by the parser

Returns: All matches and their obfuscated counterparts
Return type: dict

This will be called for every line in every file we process, so that every parser has a chance to scrub everything.

Parameters: line (str) – The line to parse for possible matches for obfuscation
Returns: The obfuscated line, and the number of changes made
Return type: tuple, (str, int)
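
The (str, int) return shape described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the sos implementation: `parse_line_sketch` is a hypothetical helper, a plain dict stands in for SoSMap(), and the `obfuscatedN` replacement scheme is invented for the example.

```python
import re


def parse_line_sketch(line, regex_patterns, mapping):
    """Obfuscate every pattern match in line; return (new_line, change_count)."""
    count = 0

    def substitute(match):
        original = match.group(0)
        # Reuse an existing replacement so the same input always maps
        # to the same obfuscated value.
        if original not in mapping:
            mapping[original] = 'obfuscated%d' % (len(mapping) + 1)
        return mapping[original]

    for pattern in regex_patterns:
        # re.subn returns the rewritten string and the number of substitutions.
        line, changes = re.subn(pattern, substitute, line)
        count += changes
    return line, count


mapping = {}  # assumption: a dict in place of SoSMap()
patterns = [r'\b(?:\d{1,3}\.){3}\d{1,3}\b']  # loose IPv4 pattern, illustrative
new_line, changes = parse_line_sketch(
    'src 10.0.0.1 dst 10.0.0.2 via 10.0.0.1', patterns, mapping)
print(new_line, changes)
```

Because the mapping persists between calls, a value seen on one line receives the same replacement on every later line, which keeps the obfuscated output internally consistent.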

Parse a given string for instances of any obfuscated items, without applying the normal regex comparisons first. This is mainly used to obfuscate filenames that have, for example, hostnames in them.

Rather than trying to regex-match string_data, this uses built-in substring checks against the known obfuscated keys.

Parameters: string_data (str) – The line to be parsed
Returns: The obfuscated line
Return type: str
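
A rough sketch of the substring approach described above, using only `in` and `str.replace`. The helper name `replace_known_keys`, the dict standing in for the map, and the sample values are all hypothetical. Checking longer keys first is a choice made for this example so that a short key cannot split a longer one containing it; the source does not say whether sos orders keys this way.

```python
def replace_known_keys(string_data, mapping):
    """Replace each known original value that appears in string_data."""
    # Longest keys first, so 'web01' cannot break up 'web01.example.com'.
    for original in sorted(mapping, key=len, reverse=True):
        if original in string_data:
            string_data = string_data.replace(original, mapping[original])
    return string_data


# Assumption: a dict of previously recorded original -> obfuscated pairs.
mapping = {
    'web01.example.com': 'host0.obfuscateddomain0.com',
    'web01': 'host0',
}
renamed = replace_known_keys('sosreport-web01-2021-01-01.tar.xz', mapping)
print(renamed)
```

No regular expression is evaluated here, so a filename that would not match a hostname pattern is still cleaned as long as the hostname was recorded earlier.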