fmts
⚓︎
String utils
Classes:
Functions:
-
anystr
–Convert a given function to accept any string type
-
anystr2anystr
–Convert a str-to-str function to allow for
AnyStr
-
b64_html_gif
–Return a HTML base64 gif image tag
-
b64_html_img
–Return an img tag given a base64-jpeg-image-string
-
b64_html_jpg
–Return a HTML base64 jpg image tag
-
b64_html_png
–Return a HTML base64 png image tag
-
base64_jpg_html
–Return an img tag given a base64-jpeg-image-string
-
binstr
–Convert an integer to a binary string
-
body_contents
–Parse the innertext for body tags in an html string
-
bytes2str
–Convert bytes to a string
-
camel2kebab
–Convert a given camelCase string to kebab-case
-
camel2pascal
–Convert a given camelCase string to PascalCase
-
camel2snake
–Convert a 'camelCase' string to a 'snake_case' string
-
camel_characters_set
–Return a set of all the characters that are allowed in camel case
-
carrots
–Add carrots (^ characters) on a line below a given string
-
dedent
–Dedent a string
-
dos2unix
–Replace CRLF line endings with LF line endings for a given string
-
dseconds
–Format time duration given initial and final timestamps in seconds
-
ensure_trailing_newline
–Return a string that has exactly one trailing newline
-
ensure_utf8
–Return a string that is ensured to be UTF-8.
-
enum_strings
–Return a generator with enumerated strings
-
filesize_str
–Get the human readable filesize string for a file given its path
-
indent
–Indent a string a given number of spaces
-
is_identifier
–Return True if a string is a valid python identifier; False otherwise
-
is_snake
–Check if a given string is snake_case
-
isidentifier
–Return True if a string is a valid python identifier; False otherwise
-
kebab2camel
–Convert a given kebab-case string to camelCase
-
kebab2pascal
–Convert a given kebab-case string to PascalCase
-
kebab2snake
–Convert a given kebab-case string to snake_case
-
kebab_characters_set
–Return a set of all the characters that are allowed in kebab case
-
long_timestamp_string
–Return a 'long-form' timestamp string given epoch-seconds float
-
longest_line
–Return the length of the longest line in a string
-
multi_replace
–Replace multiple patterns in a string
-
nbytes_str
–Format a number of bytes in human-readable form
-
nseconds
–Format a number of seconds as a human readable string
-
overscore
–Add underscores on a line above the given string
-
overscore_carrots
–Add underscores on a line above a string and carrots on a line below
-
pascal2camel
–Convert a 'PascalCase' string to a 'camelCase' string
-
pascal2kebab
–Convert a given PascalCase string to kebab-case
-
pascal2snake
–Convert a given PascalCase string to snake_case
-
printable_characters_set
–Return set of all printable characters
-
randhexstr
–Return a random hex string
-
random_string
–Return a random ascii string (length=str_len; default=4)
-
rm_b
–Remove the b'' from binary strings and sub-strings that contain b''
-
rm_character
–Remove a character in a string globally
-
rm_dunderscore
–Replace n>=2 underscores with a single underscore
-
rm_multilines
–Remove blank lines from a string
-
rm_u
–Remove the u'' from unicode strings and sub-strings that contain u''
-
rm_whitespace
–Replace runs of n>=2 spaces with a single join string (a space by default)
-
snake2camel
–Convert a given snake_case string to camelCase
-
snake2kebab
–Convert a given snake_case string to kebab-case
-
snake2pascal
–Convert a given snake_case string to PascalCase
-
space_pad_strings
–Space pads strings to match the string with the max length
-
string_sanitize
–Clean up a string
-
strip_ascii
–Remove all ascii characters from a string
-
strip_comments
–Remove comments from python/shell scripts given the script as a string
-
strip_non_ascii
–Remove all non-ascii characters from a string
-
striterable
–Yield 'clean' sub-strings from an input string
-
timestamp
–Time stamp string w/ format yyyymmdd-HHMMSS
-
truncate_string
–Truncate a string at either a max number of lines or characters
-
udiff
–Return a unified diff as a string
HTML
⚓︎
HTML formatting utils staticmethod container
Methods:
-
html_tag
–Return an HTML tag with a string as the innerHTML
-
table
–Return a string surrounded with 'table-HTML' tags
-
tablebody
–Return a string surrounded with 'tbody-HTML' tags
-
tablehead
–Return a string surrounded with 'thead-HTML' tags
-
tbody
–Return a string surrounded with 'tbody-HTML' tags
-
td
–Return a string surrounded with 'td-HTML' tags
-
th
–Return a string surrounded with 'th-HTML' tags
-
thead
–Return a string surrounded with 'thead-HTML' tags
-
tr
–Return a string surrounded with 'tr-HTML' tags
anystr
⚓︎
Convert a given function to accept any string type
Parameters:
Returns:
Examples:
>>> def _is_upper(string: str) -> bool:
... return string == string.upper()
>>> is_upper = anystr(_is_upper)
>>> is_upper('hello')
False
>>> is_upper('HELLO')
True
>>> is_upper(b'hello')
False
>>> is_upper(b'HELLO')
True
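The wrapping behavior shown in these doctests can be approximated as follows. This is a minimal sketch, not the library's implementation: the name `anystr_sketch` is hypothetical, and decoding bytes input as UTF-8 is an assumption.

```python
from functools import wraps
from typing import Callable, TypeVar, Union

T = TypeVar("T")


def anystr_sketch(func: Callable[[str], T]) -> Callable[[Union[str, bytes]], T]:
    """Wrap a str-only function so it also accepts bytes (hypothetical helper)."""

    @wraps(func)
    def wrapper(string: Union[str, bytes]) -> T:
        # Assumption: bytes input is UTF-8 encoded text.
        if isinstance(string, (bytes, bytearray)):
            string = string.decode("utf-8")
        return func(string)

    return wrapper


def _is_upper(string: str) -> bool:
    return string == string.upper()


is_upper = anystr_sketch(_is_upper)
```

Because the wrapped function here returns a bool, no re-encoding of the result is needed; a str-to-str function would additionally need its result encoded back to bytes.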
anystr2anystr
⚓︎
b64_html_gif
⚓︎
b64_html_gif(b64_string: str) -> str
b64_html_img
⚓︎
Return an img tag given a base64-jpeg-image-string
b64_html_jpg
⚓︎
b64_html_jpg(b64_string: str) -> str
b64_html_png
⚓︎
b64_html_png(b64_string: str) -> str
base64_jpg_html
⚓︎
Return an img tag given a base64-jpeg-image-string
binstr
⚓︎
body_contents
⚓︎
body_contents(html_string: str) -> List[str]
bytes2str
⚓︎
camel2kebab
⚓︎
Convert a given camelCase string to kebab-case
Examples:
>>> camel2kebab('camelCase')
'camel-case'
>>> camel2kebab(b'camelCase')
b'camel-case'
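One common way to get this kind of conversion is a zero-width regex match before each uppercase letter. The sketch below is an illustration under that assumption; `camel2kebab_sketch` is a hypothetical name and the library may do this differently.

```python
import re
from typing import AnyStr


def camel2kebab_sketch(string: AnyStr) -> AnyStr:
    """Convert camelCase to kebab-case for str or bytes (hypothetical helper)."""
    if isinstance(string, bytes):
        # Assumption: bytes are UTF-8; convert as text, then encode back.
        return camel2kebab_sketch(string.decode("utf-8")).encode("utf-8")
    # Insert '-' before every uppercase letter that is not at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", string).lower()
```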
camel2pascal
⚓︎
Convert a given camelCase string to PascalCase
Examples:
>>> camel2pascal('camelCase')
'CamelCase'
>>> camel2pascal(b'camelCase')
b'CamelCase'
camel2snake
⚓︎
camel_characters_set
cached
⚓︎
Return a set of all the characters that are allowed in camel case
Examples:
>>> CAMEL_CHARACTERS
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_'
>>> camel_characters_set() == {c for c in CAMEL_CHARACTERS}
True
carrots
⚓︎
dedent
⚓︎
Dedent a string
Parameters:
Returns:
-
str
–Unindented string
Examples:
>>> s = ' this is a string'
>>> dedent(s)
'this is a string'
>>> s = ' this is a string'
>>> dedent(s)
' this is a string'
>>> s = " this is a string\n with 2 lines"
>>> print(dedent(s))
this is a string
with 2 lines
dos2unix
⚓︎
Replace CRLF line endings with LF line endings for a given string
Examples:
>>> dos2unix('hello\r\nworld')
'hello\nworld'
>>> type(b'hello\r\nworld')
<class 'bytes'>
>>> dos2unix(b'hello\r\nworld')
b'hello\nworld'
>>> type(dos2unix(b'hello\r\nworld'))
<class 'bytes'>
dseconds
⚓︎
Format time duration given initial and final timestamps in seconds
Parameters:
-
ti
(Union[float, int]
) –Initial time in seconds
-
tf
(Union[float, int]
) –Final time in seconds
Returns:
-
str
(str
) –Formatted time duration string readable by humans
Examples:
Less than or equal to one second
>>> for i in range(0, 10, 2):
... (10**-i, dseconds(0, 10**(-i)))
...
(1, '1.000 sec')
(0.01, '10.000 ms')
(0.0001, '100.000 μs')
(1e-06, '1.000 μs')
(1e-08, '10.000 ns')
Greater than or equal to one second
>>> for i in range(0, 5):
... (f"{10**i} seconds", dseconds(0, 10**i))
...
('1 seconds', '1.000 sec')
('10 seconds', '10.000 sec')
('100 seconds', '01:40 (mm:ss)')
('1000 seconds', '16:40 (mm:ss)')
('10000 seconds', '02:46:40 (hh:mm:ss)')
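The unit selection visible in the examples above (sub-second units below one second, clock-style output from one minute up) could be sketched like this. The names `format_duration` and `dseconds_sketch` are hypothetical, and the 60-second threshold and the use of the absolute value are assumptions read off the doctest output.

```python
def format_duration(seconds: float) -> str:
    """Format a duration in seconds as a human-readable string (sketch)."""
    seconds = abs(seconds)  # assumption: sign is ignored
    if seconds == 0:
        return "0 sec"
    if seconds >= 60:
        # Clock-style output; fractional seconds are dropped here.
        minutes, secs = divmod(int(seconds), 60)
        hours, minutes = divmod(minutes, 60)
        if hours:
            return f"{hours:02d}:{minutes:02d}:{secs:02d} (hh:mm:ss)"
        return f"{minutes:02d}:{secs:02d} (mm:ss)"
    # Pick the largest unit in which the value is at least 1.
    for factor, unit in ((1, "sec"), (1e3, "ms"), (1e6, "μs"), (1e9, "ns")):
        if seconds * factor >= 1:
            return f"{seconds * factor:.3f} {unit}"
    return f"{seconds * 1e9:.3f} ns"


def dseconds_sketch(ti: float, tf: float) -> str:
    """Format the duration between two timestamps given in seconds."""
    return format_duration(tf - ti)
```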
ensure_trailing_newline
⚓︎
Return a string that has exactly one trailing newline
Examples:
>>> ensure_trailing_newline("foo")
'foo\n'
>>> ensure_trailing_newline('foo\n')
'foo\n'
>>> ensure_trailing_newline("foo\n\n")
'foo\n'
>>> ensure_trailing_newline(b'foo')
b'foo\n'
>>> ensure_trailing_newline(b'foo\n')
b'foo\n'
>>> ensure_trailing_newline(b'foo\n\n')
b'foo\n'
ensure_utf8
⚓︎
Return a string that is ensured to be UTF-8.
This is needed for the rare cases where a non-unicode character or escape sequence is present within a string; this function protects against the possibility of a UnicodeDecodeError.
Parameters:
Returns:
-
str
–The unicode-encoded version of a string
Examples:
>>> ensure_utf8('hello')
'hello'
>>> ensure_utf8(b'hello')
'hello'
>>> latin_bytes_with_weird_characters = b'hello\xc3\xa9'
>>> latin_string = ensure_utf8(latin_bytes_with_weird_characters)
>>> latin_string
'helloé'
>>> problem_bytes = b'hello\x00'
>>> problem_bytes_utf8 = ensure_utf8(problem_bytes)
>>> problem_bytes_utf8
'hello\x00'
>>> ensure_utf8('hello\x00')
'hello\x00'
>>> ensure_utf8(b'hello\x00')
'hello\x00'
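A decode-with-fallback pattern is one way to avoid a UnicodeDecodeError as described above. The sketch below assumes a Latin-1 fallback, which is not confirmed by this page; `ensure_utf8_sketch` is a hypothetical name.

```python
from typing import Union


def ensure_utf8_sketch(string: Union[str, bytes]) -> str:
    """Return text for str or bytes input without raising on bad bytes (sketch)."""
    if isinstance(string, str):
        return string
    try:
        return string.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: fall back to Latin-1, which maps every byte to a code point.
        return string.decode("latin-1")
```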
enum_strings
⚓︎
filesize_str
⚓︎
Get the human readable filesize string for a file given its path
Parameters:
Returns:
-
str
(str
) –Size of the file
Raises:
-
FileNotFoundError
–If the given fspath does not exist
Examples:
>>> fspath = "filesize_str.doctest.txt"
>>> with open(fspath, "w") as f:
... f.write("dummy file with some stuff")
26
>>> filesize_str(fspath)
'26.0 bytes'
>>> from os import remove; remove(fspath)
indent
⚓︎
indent(
string: AnyStr,
prefix: str = "    ",
predicate: Optional[Callable[[str], bool]] = None,
) -> AnyStr
Indent a string a given number of spaces
Parameters:
-
string
(AnyStr
) –string to indent
-
prefix
(str
, default:'    '
) –prefix to use for indentation; defaults to 4 spaces
-
predicate
(Optional[Callable[[str], bool]]
, default:None
) –Optional predicate to determine whether to indent a line
Returns:
-
AnyStr
–Indented string
Examples:
>>> s = "this is a string"
>>> indent(s)
' this is a string'
>>> s = "this is a\nmultiline string"
>>> indent(s)
' this is a\n multiline string'
>>> print(indent(s))
this is a
multiline string
>>> s = "this is a\nmultiline string"
>>> print(indent(s, ' '))
this is a
multiline string
>>> indent(b'this is a string')
b' this is a string'
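The signature above has the same shape as the standard library's `textwrap.indent(text, prefix, predicate=None)`. Whether this function delegates to it is an assumption; the sketch only shows how a prefix and a predicate interact in the stdlib counterpart.

```python
import textwrap

s = "this is a\nmultiline string"

# Prefix every line with four spaces.
indented = textwrap.indent(s, "    ")

# Prefix only the lines for which the predicate returns True.
selective = textwrap.indent(s, "    ", predicate=lambda line: "multiline" in line)
```

Note that `textwrap.indent` handles `str` only, so the bytes support shown in the doctest above would need a decode/encode step around it.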
is_dunder
⚓︎
is_identifier
⚓︎
Return True if a string is a valid python identifier; False otherwise
Parameters:
Returns:
-
bool
(bool
) –True if is an identifier
Examples:
>>> is_identifier("herm")
True
>>> is_identifier("something that contains spaces")
False
>>> is_identifier("import")
False
>>> is_identifier("something.with.periods")
False
>>> is_identifier("astring-with-dashes")
False
>>> is_identifier("astring_with_underscores")
True
>>> is_identifier(b"astring_with_underscores")
True
>>> is_identifier(123)
False
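The results above (keywords such as "import" rejected, bytes accepted, non-strings rejected) match what you get by combining `str.isidentifier` with `keyword.iskeyword`. A minimal sketch under that assumption, with the hypothetical name `is_identifier_sketch`:

```python
import keyword
from typing import Any


def is_identifier_sketch(value: Any) -> bool:
    """Return True if value is a usable Python identifier (sketch)."""
    if isinstance(value, bytes):
        try:
            value = value.decode("utf-8")  # assumption: bytes are UTF-8 text
        except UnicodeDecodeError:
            return False
    if not isinstance(value, str):
        return False
    # str.isidentifier() alone returns True for keywords, so exclude them.
    return value.isidentifier() and not keyword.iskeyword(value)
```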
is_snake
⚓︎
isidentifier
⚓︎
Return True if a string is a valid python identifier; False otherwise
Parameters:
Returns:
-
bool
(bool
) –True if is an identifier
Examples:
>>> isidentifier("herm")
True
>>> isidentifier("something that contains spaces")
False
>>> isidentifier("import")
False
>>> isidentifier("something.with.periods")
False
>>> isidentifier("astring-with-dashes")
False
>>> isidentifier("astring_with_underscores")
True
>>> isidentifier(123)
False
kebab2camel
⚓︎
Convert a given kebab-case string to camelCase
Examples:
>>> kebab2camel('kebab-case')
'kebabCase'
>>> kebab2camel(b'kebab-case')
b'kebabCase'
kebab2pascal
⚓︎
Convert a given kebab-case string to PascalCase
Examples:
>>> kebab2pascal('kebab-case')
'KebabCase'
>>> kebab2pascal(b'kebab-case')
b'KebabCase'
kebab2snake
⚓︎
kebab_characters_set
cached
⚓︎
Return a set of all the characters that are allowed in kebab case
Examples:
>>> KEBAB_CHARACTERS
'abcdefghijklmnopqrstuvwxyz0123456789-'
>>> kebab_characters_set() == {c for c in KEBAB_CHARACTERS}
True
long_timestamp_string
⚓︎
Return a 'long-form' timestamp string given epoch-seconds float
longest_line
⚓︎
Return the length of the longest line in a string
Parameters:
Returns:
-
int
(int
) –The length of the longest line in the string
Examples:
>>> longest_line('hello\nworld')
5
>>> longest_line(b'hello\nworld')
5
>>> longest_line('hello world')
11
>>> longest_line(b'hello world')
11
multi_replace
⚓︎
multi_replace(
string: str,
replacements: Union[
List[Tuple[str, str]],
List[List[str]],
Dict[str, str],
ItemsView[str, str],
],
) -> str
Replace multiple patterns in a string
Parameters:
-
string
(str
) –String to apply the replacements to
-
replacements
(Union[List[Tuple[str, str]], List[List[str]], Dict[str, str], ItemsView[str, str]]
) –Replacement combos
Returns:
-
str
(str
) –Input string with all the replacements applied in order
Examples:
Works with a list of lists!
>>> replacements = [['hello', 'goodbye'], ['world', 'earth']]
>>> multi_replace('hello, world', replacements)
'goodbye, earth'
Works with a list of tuples!
>>> replacements = [('hello', 'goodbye'), ('world', 'earth')]
>>> multi_replace('hello, world', replacements)
'goodbye, earth'
Works with a dictionary where all keys and values are strings!
>>> replacements = {'hello': 'goodbye', 'world': 'earth'}
>>> multi_replace('hello, world', replacements)
'goodbye, earth'
>>> replacements = [['hello', 'goodbye'], ['world', 'this', 'will', 'fail']]
>>> try:
... multi_replace('hello, world', replacements)
... except ValueError:
... print('ValueError raised')
ValueError raised
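The accepted input shapes and the ValueError case shown above can be reproduced with a short loop over (old, new) pairs. This is a sketch, not the library's code: `multi_replace_sketch` is a hypothetical name and the error message is invented for illustration.

```python
from collections.abc import Mapping


def multi_replace_sketch(string, replacements):
    """Apply (old, new) replacements to a string in order (sketch)."""
    # A mapping is treated as its (key, value) pairs.
    pairs = replacements.items() if isinstance(replacements, Mapping) else replacements
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError(f"expected an (old, new) pair, got {pair!r}")
        old, new = pair
        string = string.replace(old, new)
    return string
```

Because replacements are applied in order, an earlier replacement can produce text that a later one matches; callers who need simultaneous replacement would need a different approach.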
nbytes
⚓︎
Alias for nbytes_str (for backward compatibility)
Examples:
>>> nbytes(100)
'100.0 bytes'
>>> nbytes(1000)
'1000.0 bytes'
>>> nbytes(10000)
'9.8 KB'
>>> nbytes(100000)
'97.7 KB'
>>> nbytes(1000000)
'976.6 KB'
>>> nbytes(10_000_000)
'9.5 MB'
>>> nbytes(100_000_000)
'95.4 MB'
>>> nbytes(1000000000)
'953.7 MB'
>>> nbytes(10000000000)
'9.3 GB'
>>> nbytes(100000000000)
'93.1 GB'
>>> nbytes(1000000000000)
'931.3 GB'
>>> nbytes(10000000000000)
'9.1 TB'
>>> nbytes(100000000000000)
'90.9 TB'
nbytes_str
⚓︎
Format a number of bytes in human-readable form
Parameters:
Returns:
-
str
(str
) –Number of bytes formatted
Raises:
-
ValueError
–If given number of bytes is invalid/negative
Examples:
>>> nbytes_str(100)
'100.0 bytes'
>>> nbytes_str(1000)
'1000.0 bytes'
>>> nbytes_str(10000)
'9.8 KB'
>>> nbytes_str(100000)
'97.7 KB'
>>> nbytes_str(1000000)
'976.6 KB'
>>> nbytes_str(10_000_000)
'9.5 MB'
>>> nbytes_str(100_000_000)
'95.4 MB'
>>> nbytes_str(1000000000)
'953.7 MB'
>>> nbytes_str(10000000000)
'9.3 GB'
>>> nbytes_str(100000000000)
'93.1 GB'
>>> nbytes_str(1000000000000)
'931.3 GB'
>>> nbytes_str(10000000000000)
'9.1 TB'
>>> nbytes_str(100000000000000)
'90.9 TB'
>>> nbytes_str(-100000000000)
'-93.1 GB'
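The outputs above are consistent with repeated division by 1024 and one decimal place of precision. A minimal sketch under that assumption (`nbytes_str_sketch` is a hypothetical name, and the "PB" fallback is an addition for completeness):

```python
def nbytes_str_sketch(nbytes: float) -> str:
    """Format a byte count using 1024-based units (sketch)."""
    size = float(nbytes)
    for unit in ("bytes", "KB", "MB", "GB", "TB"):
        # abs() keeps negative inputs in the same unit as their magnitude.
        if abs(size) < 1024.0:
            return f"{size:.1f} {unit}"
        size /= 1024.0
    return f"{size:.1f} PB"
```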
nseconds
⚓︎
Format a number of seconds as a human readable string
If t2 is None, formats nsec as a string; otherwise calculates and formats the duration t2 - nsec.
Parameters:
Returns:
-
str
(str
) –Formatted time duration string readable by humans
Examples:
Less than or equal to one second
>>> for i in range(0, 10, 2):
... (10**-i, nseconds(10**(-i)))
...
(1, '1.000 sec')
(0.01, '10.000 ms')
(0.0001, '100.000 μs')
(1e-06, '1.000 μs')
(1e-08, '10.000 ns')
Greater than or equal to one second
>>> for i in range(0, 5):
... (f"{10**i} seconds", nseconds(10**i))
...
('1 seconds', '1.000 sec')
('10 seconds', '10.000 sec')
('100 seconds', '01:40 (mm:ss)')
('1000 seconds', '16:40 (mm:ss)')
('10000 seconds', '02:46:40 (hh:mm:ss)')
>>> nseconds(-60)
'01:00 (mm:ss)'
>>> nseconds(60)
'01:00 (mm:ss)'
>>> nseconds(0)
'0 sec'
overscore
⚓︎
overscore_carrots
⚓︎
Add underscores on a line above a string and carrots on a line below
Parameters:
Returns:
-
str
–'Over-scored' and carroted string
Examples:
>>> overscore_carrots("A TITLE")
'_______\nA TITLE\n^^^^^^^'
>>> print(overscore_carrots("A TITLE"))
_______
A TITLE
^^^^^^^
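The framing shown above amounts to repeating a character to the width of the text. The sketch below assumes the width is taken from the longest line of the input; `overscore_carrots_sketch` is a hypothetical name.

```python
def overscore_carrots_sketch(string: str) -> str:
    """Put a line of '_' above and a line of '^' below a string (sketch)."""
    # Assumption: the rule width follows the longest line of the input.
    width = max((len(line) for line in string.splitlines()), default=0)
    return f"{'_' * width}\n{string}\n{'^' * width}"
```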
pascal2camel
⚓︎
pascal2kebab
⚓︎
Convert a given PascalCase string to kebab-case
Examples:
>>> pascal2kebab('PascalCase')
'pascal-case'
>>> pascal2kebab(b'PascalCase')
b'pascal-case'
pascal2snake
⚓︎
Convert a given PascalCase string to snake_case
Examples:
>>> pascal2snake('PascalCase')
'pascal_case'
>>> pascal2snake(b'PascalCase')
b'pascal_case'
printable_characters_set
cached
⚓︎
randhexstr
⚓︎
Return a random hex string
Parameters:
Returns:
-
str
(str
) –random hex string
Examples:
>>> a = randhexstr()
>>> isinstance(a, str)
True
>>> len(a)
8
>>> b = randhexstr(10)
>>> isinstance(b, str)
True
>>> len(b)
10
>>> randhexstr(0)
Traceback (most recent call last):
...
ValueError: length must be a positive even number
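The even-length requirement in the error message above is what you would expect if each random byte yields two hex characters. A sketch using the standard library's `secrets.token_hex`, which is an assumption about the approach; `randhexstr_sketch` is a hypothetical name.

```python
import secrets


def randhexstr_sketch(length: int = 8) -> str:
    """Return a random hex string with the given even length (sketch)."""
    if length <= 0 or length % 2:
        raise ValueError("length must be a positive even number")
    # token_hex(n) returns 2 * n hex characters.
    return secrets.token_hex(length // 2)
```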
random_string
⚓︎
Return a random ascii string (length=str_len; default=4)
Examples:
>>> a = random_string()
>>> isinstance(a, str)
True
>>> len(a)
8
>>> a = random_string(12)
>>> isinstance(a, str)
True
>>> len(a)
12
>>> a = random_string(8, hex=True)
>>> isinstance(a, str)
True
>>> len(a)
8
rm_b
⚓︎
Remove the b'' from binary strings and sub-strings that contain b''
Taken from 'pupy' (Pretty Useful Python (which jesse wrote))
Parameters:
Returns:
-
str
(str
) –A string without the binary b'' quotes surrounding it
Examples:
>>> rm_b("b'a_string'")
'a_string'
rm_character
⚓︎
rm_dunderscore
⚓︎
rm_multilines
⚓︎
rm_u
⚓︎
rm_whitespace
⚓︎
Replace runs of n>=2 spaces with a single join string (a space by default)
Parameters:
-
join_str
(str
, default:' '
) –String to join on; defaults to a space (' ')
-
string
(str
) –String to remove spaces from
Returns:
-
str
–String with each run of spaces replaced by join_str
Examples:
>>> rm_whitespace('there are lots of spaces')
'there are lots of spaces'
>>> rm_whitespace('there are lots of spaces', join_str='_')
'there_are_lots_of_spaces'
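Collapsing whitespace and rejoining with a chosen separator is a split-and-join operation. The sketch below assumes any whitespace (not only spaces) counts as a separator, which may be broader than the library's behavior; `rm_whitespace_sketch` is a hypothetical name.

```python
def rm_whitespace_sketch(string: str, join_str: str = " ") -> str:
    """Collapse runs of whitespace into a single join_str (sketch)."""
    # str.split() with no arguments splits on runs of any whitespace
    # and drops leading and trailing whitespace.
    return join_str.join(string.split())
```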
snake2camel
⚓︎
snake2kebab
⚓︎
snake2pascal
⚓︎
space_pad_strings
⚓︎
Space pads strings to match the string with the max length
Returns:
Examples:
>>> space_pad_strings(["a", "bb", "ccc"])
['a ', 'bb ', 'ccc']
>>> space_pad_strings(["a", "bb", "ccc"], justify='right')
[' a', ' bb', 'ccc']
>>> space_pad_strings(["a", "bb", "ccc"], justify='center')
Traceback (most recent call last):
...
ValueError: justify must be 'left' or 'right', not center; case-insensitive
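Padding to a common width maps directly onto `str.ljust` and `str.rjust`. A minimal sketch with the hypothetical name `space_pad_strings_sketch`; the error message is copied from the traceback above.

```python
from typing import List, Sequence


def space_pad_strings_sketch(strings: Sequence[str], justify: str = "left") -> List[str]:
    """Pad strings with spaces to the length of the longest one (sketch)."""
    width = max((len(s) for s in strings), default=0)
    mode = justify.lower()
    if mode == "left":
        return [s.ljust(width) for s in strings]
    if mode == "right":
        return [s.rjust(width) for s in strings]
    raise ValueError(
        f"justify must be 'left' or 'right', not {justify}; case-insensitive"
    )
```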
str_is_identifier
⚓︎
string_sanitize
⚓︎
strip_ascii
⚓︎
strip_comments
⚓︎
Remove comments from python/shell scripts given the script as a string
Parameters:
Returns:
-
str
(str
) –input string with comments stripped out
Examples:
Here is an example of stripping comments from a python-ish script:
>>> python_script_ish = r'''# some encoding
... # this is a comment
... # this is another comment
... print('hello bob')
... print('hello bobert') # bob is short for bobert
... '''
>>> a = strip_comments(python_script_ish)
>>> a.splitlines(keepends=False)
['', '', '', "print('hello bob')", "print('hello bobert') "]
Here is an example of stripping comments from a bash/shell-ish script:
>>> bash_script_ish = r'''#!/bin/bash
... # this is a comment
... # this is another comment
... echo "hello"
... echo "hello again" # comment
... '''
>>> a = strip_comments(bash_script_ish)
>>> a.splitlines(keepends=False)
['', '', '', 'echo "hello"', 'echo "hello again" ']
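The output above (comment lines left empty, trailing comments cut but the preceding space kept) is what a simple "delete from # to end of line" rule produces. The sketch below implements only that naive rule, as an assumption; it would also cut a `#` that appears inside a string literal, so it is not a safe general-purpose stripper. `strip_comments_sketch` is a hypothetical name.

```python
import re


def strip_comments_sketch(script: str) -> str:
    """Remove everything from '#' to the end of each line (naive sketch)."""
    # Caveat: this does not understand quoting, so '#' inside a string
    # literal is treated as the start of a comment too.
    return re.sub(r"#[^\n]*", "", script)
```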
strip_non_ascii
⚓︎
striterable
⚓︎
Yield 'clean' sub-strings from an input string
This function takes a string (such as the contents of a dat file) and yields the sub-strings of that string that are separated by a delimiter.
Delimiters
(unix AND dos!)
Parameters:
Returns:
Examples:
Simple spaces example:
>>> string_w_spaces = 'this is a string with spaces'
>>> list(striterable(string_w_spaces))
['this', 'is', 'a', 'string', 'with', 'spaces']
Leading and trailing spaces example:
>>> string_w_spaces = ' this is a string with spaces '
>>> list(striterable(string_w_spaces))
['this', 'is', 'a', 'string', 'with', 'spaces']
Tabs example:
>>> strings = ['string', 'separated', 'by', 'tabs']
>>> tab_separated = '\t'.join(strings)
>>> list(striterable(tab_separated))
['string', 'separated', 'by', 'tabs']
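For the whitespace cases shown above, a generator over `str.split()` gives the same results. The full delimiter set is not recoverable from this page, so the sketch covers whitespace only; `striterable_sketch` is a hypothetical name.

```python
from typing import Iterator


def striterable_sketch(string: str) -> Iterator[str]:
    """Yield non-empty sub-strings separated by whitespace (sketch)."""
    # Assumption: only whitespace (spaces, tabs, LF and CRLF newlines)
    # is treated as a delimiter here.
    for token in string.split():
        yield token
```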
t9
⚓︎
t9_str
⚓︎
timestamp
⚓︎
Time stamp string w/ format yyyymmdd-HHMMSS
Parameters:
Returns:
-
str
–timestamp string
Examples:
>>> from datetime import datetime
>>> stamps = ['20190225-161151', '20190225-081151']
>>> timestamp(1551111111.111111) in stamps
True
>>> datetime.now().strftime("%Y%m%d-%H%M%S") == timestamp()
True
>>> timestamp(datetime.now()) == timestamp()
True
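The yyyymmdd-HHMMSS format corresponds to the `strftime` pattern used in the doctest above. The sketch below shows why the first example checks against two candidate strings: the result depends on the time zone. `timestamp_sketch` and its `tz` parameter are hypothetical, and unlike the documented function it accepts only epoch seconds, not datetime objects.

```python
from datetime import datetime, timezone, tzinfo
from typing import Optional


def timestamp_sketch(ts: Optional[float] = None, tz: Optional[tzinfo] = None) -> str:
    """Format epoch seconds (or now) as yyyymmdd-HHMMSS (sketch)."""
    # With tz=None the local time zone is used, so output varies by machine.
    dt = datetime.now(tz) if ts is None else datetime.fromtimestamp(ts, tz)
    return dt.strftime("%Y%m%d-%H%M%S")


utc_stamp = timestamp_sketch(1551111111.111111, tz=timezone.utc)
```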
truncate_string
⚓︎
Truncate a string at either a max number of lines or characters
Parameters:
-
string
(str
) –String to truncate
-
maxlines
(int
, default:120
) –Max number of lines the truncated string can have; default is 120
-
max_characters
(int
, default:4096
) –Max number of characters the string can have; default is 4096
Returns:
-
str
–Truncated string
Examples:
>>> truncate_string('a')
'a'
>>> print(truncate_string('a\n' * 10, maxlines=5))
a
a
a
a
a
---------------------------
... Truncated @ 5 lines...
---------------------------
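A two-limit truncation like the one described above could be sketched as follows. The order in which the limits are checked, the wording of the character-limit notice, and the exact rule width are assumptions; `truncate_string_sketch` is a hypothetical name.

```python
def truncate_string_sketch(
    string: str, maxlines: int = 120, max_characters: int = 4096
) -> str:
    """Truncate at a line limit or a character limit (sketch)."""
    lines = string.splitlines()
    if len(lines) > maxlines:
        notice = f"... Truncated @ {maxlines} lines..."
        rule = "-" * len(notice)
        return "\n".join(lines[:maxlines] + [rule, notice, rule])
    if len(string) > max_characters:
        notice = f"... Truncated @ {max_characters} characters..."
        rule = "-" * len(notice)
        return "\n".join([string[:max_characters], rule, notice, rule])
    return string
```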
udiff
⚓︎
udiff(
a_lines: Sequence[str],
b_lines: Sequence[str],
fromfile: str = "A",
tofile: str = "B",
n: int = 0,
maxlines: int = 120,
max_characters: int = 4096,
) -> str
Return a unified diff as a string
Parameters:
-
a_lines
(Sequence[str]
) –First set of lines as strings
-
b_lines
(Sequence[str]
) –Second set of lines as strings
-
fromfile
(str
, default:'A'
) –Name or label of the first file/lines (Default = 'A')
-
tofile
(str
, default:'B'
) –Name or label of the second file/lines (Default = 'B')
-
n
(int
, default:0
) –Number of context lines to give in diff (Default = 0)
-
maxlines
(int
, default:120
) –Number of diff lines to truncate at (Default = 120)
-
max_characters
(int
, default:4096
) –Number of characters to truncate at (Default = 4096)
Returns:
-
str
–unified diff string that is truncated if too long
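The parameters above (`fromfile`, `tofile`, `n`) mirror those of the standard library's `difflib.unified_diff`. Whether `udiff` is built on it is an assumption; the sketch below shows the stdlib call with the same defaults and omits the truncation step. `udiff_sketch` is a hypothetical name.

```python
import difflib
from typing import Sequence


def udiff_sketch(
    a_lines: Sequence[str],
    b_lines: Sequence[str],
    fromfile: str = "A",
    tofile: str = "B",
    n: int = 0,
) -> str:
    """Return a unified diff of two line sequences as one string (sketch)."""
    diff = difflib.unified_diff(
        a_lines,
        b_lines,
        fromfile=fromfile,
        tofile=tofile,
        n=n,
        lineterm="",  # input lines are assumed to carry no trailing newlines
    )
    return "\n".join(diff)


example = udiff_sketch(["a", "b"], ["a", "c"])
```

With `n=0` the diff contains only the changed lines and their hunk headers, with no surrounding context.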