bytecode

pydecipher.bytecode.create_opmap_from_file(file_path: PathLike) → Dict[str, int]

Return an opcode map dictionary of OPNAME : OPCODE from a JSON file.

The JSON file must enumerate a complete opmap for the specified Python version. Even if only a few opcodes have been swapped, every operation must be listed with its opcode value for that version.

Parameters:

file_path (os.PathLike) –

The path to the JSON remapping file. The file must follow this format:

{
    "python_version": "<major>.<minor>(.<patch>)",
    "remapped_opcodes": [
        {
            "opcode": 1,
            "opname": "POP_TOP",
            "remapped_value": 5
        },
        {
            "opcode": 2,
            "opname": "ROT_TWO",
            "remapped_value": 4
        },
        ...
    ]
}

Returns:

A dictionary of OPNAME : OPCODE. For example:

{
    'POP_TOP': 5,
    'ROT_TWO': 4,
    ...
}

Return type:

Dict[str, int]
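
The mapping from the JSON format above to the returned dictionary can be sketched as follows. This is a minimal illustration, not pydecipher's implementation: the helper name opmap_from_remap_json is hypothetical, it reads from a string rather than a file path, and the two-entry document (with an assumed version of "3.8") is deliberately incomplete, whereas a real remapping file must list every operation.

```python
import json
from typing import Dict


def opmap_from_remap_json(text: str) -> Dict[str, int]:
    """Build an OPNAME -> OPCODE dict from a remapping document (hypothetical helper)."""
    document = json.loads(text)
    opmap: Dict[str, int] = {}
    for entry in document["remapped_opcodes"]:
        # Each entry pairs an operation name with its remapped opcode value.
        opmap[entry["opname"]] = entry["remapped_value"]
    return opmap


# Shortened example document; a real file must cover the complete opmap.
example = """
{
    "python_version": "3.8",
    "remapped_opcodes": [
        {"opcode": 1, "opname": "POP_TOP", "remapped_value": 5},
        {"opcode": 2, "opname": "ROT_TWO", "remapped_value": 4}
    ]
}
"""

print(opmap_from_remap_json(example))  # {'POP_TOP': 5, 'ROT_TWO': 4}
```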

pydecipher.bytecode.create_pyc_header(magic_int: int, compilation_ts: int | datetime | None = None, file_size: int = 0) → bytes

Return the header bytes necessary for creation of a compiled Python file.

Parameters:
  • magic_int (int) – The Python magic number that should be used in the header. This is also used to determine the Python version for which the header is being created, and consequently, the length and format of the header.

  • compilation_ts (Union[int, datetime], optional) – The compilation timestamp (if any) to put in the header.

  • file_size (int, optional) – The size of the source code, mod 2^32, to put in the header.

Returns:

The 8, 12, or 16 bytes of the header, depending on the compiled version for which the header was created.

Return type:

bytes
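
The three header lengths correspond to CPython's pyc layouts: magic plus timestamp (8 bytes, before Python 3.3), an added source-size field (12 bytes, Python 3.3 to 3.6), and an added flags field (16 bytes, Python 3.7 and later). The sketch below builds each layout with struct. It is an illustration only: the function name and the header_len parameter are hypothetical, whereas create_pyc_header derives the layout from magic_int.

```python
import struct


def sketch_pyc_header(magic_int: int, timestamp: int = 0,
                      source_size: int = 0, header_len: int = 16) -> bytes:
    """Assemble an 8-, 12-, or 16-byte pyc header (hypothetical helper)."""
    if header_len not in (8, 12, 16):
        raise ValueError("header_len must be 8, 12, or 16")
    # Magic field: 2-byte little-endian magic integer followed by b"\r\n".
    header = struct.pack("<H", magic_int) + b"\r\n"
    if header_len == 16:
        # 16-byte layout: a 4-byte flags field follows the magic (0 = timestamp-based).
        header += struct.pack("<I", 0)
    # Compilation timestamp, truncated to 32 bits.
    header += struct.pack("<I", timestamp & 0xFFFFFFFF)
    if header_len >= 12:
        # Source size, mod 2^32.
        header += struct.pack("<I", source_size & 0xFFFFFFFF)
    return header


print(len(sketch_pyc_header(3413, header_len=16)))  # 16
```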

pydecipher.bytecode.decompile_pyc(arg_tuple: Tuple[Path, Dict[str, int], Dict[str, bool | PathLike]]) → str

Decompile a single Python bytecode file.

Parameters:

arg_tuple (Tuple[pathlib.Path, Dict[str, int], Dict[str, Union[bool, os.PathLike]]]) –

A tuple containing the arguments for this function. This is a tuple because pebble’s Pool.map() function couldn’t pass multiple arguments to a subprocessed function call. The tuple entries correspond to the following arguments:

pyc_file: pathlib.Path

The path to the compiled Python file.

alternate_opmap: Dict[str, int], optional

If this bytecode file was produced by an interpreter with remapped opcodes, you must provide the opmap as an OPNAME: OPCODE dictionary.

logging_options: Dict[str, Union[bool, os.PathLike]], optional

A dictionary of logging options. This is only needed when pydecipher is performing multi-processed decompilation. The keys can be the following strings:

verbose: bool

True will enable verbose logging.

quiet: bool

True will silence all console logging.

log_path: pathlib.Path

If a path object is passed in as the log_path, the running instance of pydecipher will continue logging to that file.

Returns:

One of the following strings:

  • no_action: This file was not decompiled.

  • success: This file was successfully decompiled.

  • error: This file could not be fully decompiled.

  • opcode_error: The error message returned by uncompyle6 indicates this file may have remapped opcodes.

Return type:

str
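
The single-tuple calling convention described above can be sketched without pydecipher installed. In this illustration, stand_in_worker is a hypothetical placeholder that performs no decompilation; it only shows how the three entries are packed by the caller, unpacked by the worker, and how the returned status strings might be tallied.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Optional, Tuple, Union

ArgTuple = Tuple[Path, Optional[Dict[str, int]], Dict[str, Union[bool, Path]]]


def stand_in_worker(arg_tuple: ArgTuple) -> str:
    """Hypothetical single-argument worker; does not decompile anything."""
    pyc_file, alternate_opmap, logging_options = arg_tuple
    # Toy rule: only paths with a .pyc suffix count as handled.
    return "success" if pyc_file.suffix == ".pyc" else "no_action"


logging_options = {"verbose": False, "quiet": True}
jobs = [(Path(name), None, logging_options)
        for name in ("a.pyc", "b.pyc", "notes.txt")]

# map() hands each call exactly one argument, hence the packed tuple.
tally = Counter(map(stand_in_worker, jobs))
print(tally["success"], tally["no_action"])  # 2 1
```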

pydecipher.bytecode.diff_opcode(code_standard: code, code_remapped: code, version: str | None = None) → Dict[int, Dict[int, int]]

Calculate remapped opcodes from two Code objects compiled from the same source code.

Parameters:
  • code_standard (Code (xdis.CodeX or types.CodeType)) – The standard-opcode Code object

  • code_remapped (Code (xdis.CodeX or types.CodeType)) – The remapped-opcode Code object

  • version (str, optional) – The Python version that marshaled the two Code objects. Used for figuring out what operations push arguments to the stack.

Returns:

A dictionary of original_opcode to Dict[replacement_opcode: replacement_count]. replacement_opcode is an opcode that was seen in place of original_opcode, and replacement_count is the number of times it was seen replacing original_opcode throughout all the bytecode that was analyzed.

Return type:

Dict[int, Dict[int, int]]

Raises:

RuntimeError – The arguments are not of the correct type, or their total opcode counts differ too much.
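
The shape of the returned dictionary can be illustrated with a simplified sketch that compares two already-extracted opcode sequences position by position. This is not the library's algorithm: diff_opcode works on Code objects and accounts for instruction arguments, while the hypothetical diff_opcode_sequences below takes plain integer lists, requires equal lengths, and (as an assumption) also records opcodes that were left unchanged.

```python
from collections import defaultdict
from typing import Dict, Sequence


def diff_opcode_sequences(standard: Sequence[int],
                          remapped: Sequence[int]) -> Dict[int, Dict[int, int]]:
    """Count which opcode appears in place of each original opcode (hypothetical helper)."""
    if len(standard) != len(remapped):
        raise RuntimeError("opcode sequences differ in length")
    counts = defaultdict(lambda: defaultdict(int))
    for original, replacement in zip(standard, remapped):
        # Tally every pairing, including opcodes that kept their value.
        counts[original][replacement] += 1
    return {original: dict(seen) for original, seen in counts.items()}


result = diff_opcode_sequences([100, 1, 100, 83], [90, 5, 90, 83])
print(result)  # {100: {90: 2}, 1: {5: 1}, 83: {83: 1}}
```

An original opcode whose inner dictionary holds more than one replacement would suggest that the two sequences are misaligned rather than consistently remapped.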

pydecipher.bytecode.process_pycs(pyc_iterable: Iterable[PathLike], alternate_opmap: Dict[str, int] | None = None) → None

Multi-processed decompilation orchestration of compiled Python files.

Currently, pydecipher uses uncompyle6 as its decompiler. It works well with xdis (same author) and allows for the decompilation of Code objects using alternate opmaps (with our extension of xdis).

This function starts CPU count * 2 pydecipher processes to decompile the given compiled Python files. It also attempts to detect an attached debugger, in which case the decompilation will be single-threaded to make debugging easier.

Parameters:
  • pyc_iterable (Iterable[os.PathLike]) – An iterable of pathlib.Path objects, referencing compiled Python files to decompile.

  • alternate_opmap (Dict[str, int], optional) – An opcode map of OPNAME: OPCODE (e.g. ‘POP_TOP’: 1). This should be a complete opmap for the Python version of the files being decompiled. Even if only two opcodes were swapped, the opcode map passed in should contain all 100+ Python bytecode operations.
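
The worker-count rule described above can be sketched as follows. How pydecipher detects a debugger is not stated here, so the sys.gettrace() check is an assumption made for illustration, and sketch_worker_count is a hypothetical name.

```python
import os
import sys


def sketch_worker_count(debugger_attached: bool) -> int:
    """Return 1 under a debugger, otherwise twice the CPU count (hypothetical helper)."""
    if debugger_attached:
        return 1
    # os.cpu_count() may return None, so fall back to a single CPU.
    return (os.cpu_count() or 1) * 2


# Assumption: a trace function installed via sys.settrace() signals a debugger.
print(sketch_worker_count(sys.gettrace() is not None))
```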

pydecipher.bytecode.validate_opmap(version: str, opmap: Dict[str, int]) → bool

Validate whether opmap is correct/well-formed for the given version.

A well-formed opcode map should not have any duplicate keys or values, nor any missing or extraneous opnames or opcodes.

Parameters:
  • version (str) –

    Typically a string like ‘2.7’ or ‘3.8.1’. However, the version string can be any version accepted by xdis, including alternate Python implementations such as 2.7.1b3Jython or 3.5pypy.

  • opmap (Dict[str, int]) – A dictionary of OPERATION NAME: OPCODE VALUE.

Returns:

Whether or not this opcode map is valid and well-formed.

Return type:

bool
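
The well-formedness rules can be sketched against an explicit reference opmap. This is an illustration under stated assumptions: validate_opmap looks up the reference for the given version itself, while the hypothetical sketch_validate_opmap takes the reference as a parameter, and the three-entry reference below is a toy, not a real opmap. Duplicate keys cannot occur in a Python dict, so only duplicate values are checked.

```python
from typing import Dict


def sketch_validate_opmap(opmap: Dict[str, int], reference: Dict[str, int]) -> bool:
    """Check an opmap against a reference opmap (hypothetical helper)."""
    if set(opmap) != set(reference):
        return False  # missing or extraneous opnames
    values = list(opmap.values())
    if len(set(values)) != len(values):
        return False  # two opnames share one opcode
    # The same opcode values must be in use, possibly in a different order.
    return set(values) == set(reference.values())


toy_reference = {"POP_TOP": 1, "ROT_TWO": 2, "NOP": 9}
swapped = {"POP_TOP": 2, "ROT_TWO": 1, "NOP": 9}
duplicated = {"POP_TOP": 1, "ROT_TWO": 1, "NOP": 9}

print(sketch_validate_opmap(swapped, toy_reference))     # True
print(sketch_validate_opmap(duplicated, toy_reference))  # False
```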

pydecipher.bytecode.version_str_to_magic_num_int(version_str: str) → int

Given a Python version string, return its magic integer.

Parameters:

version_str (str) –

Typically a string like ‘2.7’ or ‘3.8.1’. However, the version string can be any version accepted by xdis, including alternate Python implementations such as 2.7.1b3Jython or 3.5pypy.

Returns:

The magic number corresponding to the version string.

Return type:

int
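
For context, the magic integer is the 2-byte little-endian value at the start of a pyc file's 4-byte magic number. The sketch below recovers it for the running interpreter only, using the standard library; it does not look up arbitrary version strings the way version_str_to_magic_num_int does, and the function name is hypothetical.

```python
import importlib.util


def running_interpreter_magic_int() -> int:
    """Return the magic integer of the interpreter executing this code (hypothetical helper)."""
    magic_bytes = importlib.util.MAGIC_NUMBER  # 4 bytes: magic integer + b"\r\n"
    return int.from_bytes(magic_bytes[:2], "little")


print(running_interpreter_magic_int())
```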