commonutil.sanitize

Helper routines for sanitizing text by removing or replacing illegal characters.

ALPHANUMBERDASHUL_CHARCTERS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_'

Upper- and lower-case English letters, digits, the dash (-), and the underscore (_).

ALPHANUMBER_CHARCTERS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

Upper- and lower-case English letters and digits.

DWHITE_CHARACTERS = '. \t\n\r\x0b\x0c\x00'

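Characters treated as whitespace and stripped from the beginning and end of a file name. Commonly used when stripping file names; compared with the plain whitespace set, it also contains the dot (.).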

WHITE_CHARACTERS = ' \t\n\r\x0b\x0c\x00'

Characters treated as whitespace in general cases (commonly used for strip operations).
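The difference between the two strip sets is easiest to see with the built-in `str.strip`. The snippet below is a minimal sketch that copies the two constant values from this page rather than importing the module; the sample name `raw_name` is made up for illustration.

```python
# Values copied from the constants documented above.
DWHITE_CHARACTERS = '. \t\n\r\x0b\x0c\x00'
WHITE_CHARACTERS = ' \t\n\r\x0b\x0c\x00'

# Hypothetical input with stray dots and whitespace on both ends.
raw_name = ' ..report.txt. \n'

# General-purpose strip: whitespace goes, the surrounding dots survive.
general = raw_name.strip(WHITE_CHARACTERS)

# File-name strip: leading and trailing dots are removed as well.
file_name = raw_name.strip(DWHITE_CHARACTERS)

print(repr(general))    # '..report.txt.'
print(repr(file_name))  # 'report.txt'
```

The dot inside `report.txt` is untouched in both cases because `str.strip` only looks at the two ends of the string.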

path_component(original_name)[source]

Replace disallowed characters in the given file or folder name with an underscore (_), and strip dots (.) and whitespace (space, \t, \r, \n) from the beginning and end of the name. The disallowed characters are \/*?:<>|\t\n\r (the space is not one of them).

Parameters: original_name – Proposed file name
Returns: Sanitized file name
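The behaviour described for path_component can be sketched as follows. This is a hypothetical re-implementation, not the library's source: the names `path_component_sketch` and `ILLEGAL_PATH_CHARACTERS` are invented here, the strip set is assumed to be DWHITE_CHARACTERS, and stripping is assumed to happen before replacement so that leading and trailing whitespace is dropped rather than turned into underscores.

```python
# Hypothetical sketch for illustration only; not the library's code.
# Character set taken from the description above (assumed constant name).
ILLEGAL_PATH_CHARACTERS = '\\/*?:<>|\t\n\r'
# Assumed strip set: the documented DWHITE_CHARACTERS value.
DWHITE_CHARACTERS = '. \t\n\r\x0b\x0c\x00'


def path_component_sketch(original_name):
    """Return a name with disallowed characters replaced by '_'."""
    # Assumption: strip first, then replace what remains.
    stripped = original_name.strip(DWHITE_CHARACTERS)
    return ''.join(
        '_' if ch in ILLEGAL_PATH_CHARACTERS else ch for ch in stripped
    )


print(path_component_sketch(' ..my:report?.txt. '))  # my_report_.txt
print(path_component_sketch('a/b\\c'))               # a_b_c
```

Spaces inside the name are kept, matching the note that the space is not a disallowed character.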
to_identifier(val, allowed_start_characters='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_', allowed_characters='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_', replace_char='_', strip_characters='. \t\n\r\x0b\x0c\x00')[source]

Sanitize the given string into an identifier string that contains only allowed characters, according to the allowed-character rules.

Parameters:
  • val – String to be sanitized
  • allowed_start_characters=IDENTIFIER_START_CHARCTERS – Characters allowed as the first character of the string
  • allowed_characters=IDENTIFIER_CHARCTERS – Characters allowed in the rest of the string
  • replace_char="_" – Character that replaces each disallowed character
  • strip_characters='. \t\n\r\x0b\x0c\x00' – If given, characters to strip from the beginning and end of the string before it is checked (the default shown in the signature equals DWHITE_CHARACTERS)
Returns:

Sanitized identifier string
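A minimal sketch of the rule described for to_identifier, with defaults copied from the signature above. The function name `to_identifier_sketch` is hypothetical, and the sketch assumes that a disallowed first character is replaced in place (not removed or prefixed); the library may behave differently in that case.

```python
import string

# Defaults copied from the documented signature.
START_CHARACTERS = string.ascii_letters + '_'
REST_CHARACTERS = string.ascii_letters + string.digits + '_'
DWHITE_CHARACTERS = '. \t\n\r\x0b\x0c\x00'


def to_identifier_sketch(val,
                         allowed_start_characters=START_CHARACTERS,
                         allowed_characters=REST_CHARACTERS,
                         replace_char='_',
                         strip_characters=DWHITE_CHARACTERS):
    """Hypothetical sketch; not the library's implementation."""
    if strip_characters:
        val = val.strip(strip_characters)
    result = []
    for index, ch in enumerate(val):
        # The first character is checked against its own, stricter set.
        allowed = allowed_start_characters if index == 0 else allowed_characters
        result.append(ch if ch in allowed else replace_char)
    return ''.join(result)


print(to_identifier_sketch(' 9lives-app. '))  # _lives_app
print(to_identifier_sketch('user.name'))      # user_name
```

Keeping a separate set for the first character is what stops a result from starting with a digit, which most identifier grammars reject.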

via_allowed(val, allowed_characters='0123456789', replace_char='_', strip_characters=None)[source]

Replace every character in the given string that is not in the allowed-character list with the given replacement character.

Characters at the beginning and end of the given string are stripped first if they appear in the strip character list.

Parameters:
  • val – String to be sanitized
  • allowed_characters="0123456789" – Allowed characters
  • replace_char="_" – Character that replaces each disallowed character
  • strip_characters=None – If given, characters to strip from the beginning and end of the string before it is checked
Returns:

Sanitized string
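The allow-list replacement described for via_allowed can be sketched in a few lines. The function name `via_allowed_sketch` is hypothetical and the body is an illustration built from the description above, not the library's code.

```python
def via_allowed_sketch(val, allowed_characters='0123456789',
                       replace_char='_', strip_characters=None):
    """Hypothetical sketch of allow-list sanitizing."""
    # Optional strip step, applied before any character is checked.
    if strip_characters is not None:
        val = val.strip(strip_characters)
    # Keep allowed characters, substitute everything else.
    return ''.join(
        ch if ch in allowed_characters else replace_char for ch in val
    )


print(via_allowed_sketch('555-0100'))                    # 555_0100
print(via_allowed_sketch(' 42 '))                        # _42_
print(via_allowed_sketch(' 42 ', strip_characters=' '))  # 42
```

The last two calls show why the strip step matters: without it, surrounding spaces are treated like any other disallowed character and become underscores.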