API

pgdumplib exposes a load method to create a Dump instance from a pg_dump file created in the custom format.

See the Examples page for how to read or create a dump.

pgdumplib.load(filepath, converter=None)

Load a pg_dump file created with -Fc (the custom format) from disk

Parameters
Raises

ValueError

Return type

pgdumplib.dump.Dump

pgdumplib.new(dbname='pgdumplib', encoding='UTF8', converter=None, appear_as='12.0')

Create a new pgdumplib.dump.Dump instance

Parameters
Return type

pgdumplib.dump.Dump

The Dump class exposes methods to load an existing dump, add an entry, add table data, add blob data, and save a new dump.

Converters format the data that is returned by table_data(). The converter is passed in when a new Dump is constructed, and is also available as an argument to pgdumplib.load().

The default converter, DataConverter, returns all fields as strings, only replacing NULL with None. The SmartDataConverter attempts to convert all columns to native Python data types.
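As a rough illustration of the default behavior described above, the following sketch splits one tab-delimited line of COPY text into fields, keeps every value as a string, and maps the null marker (\N) to None. The name split_copy_row is hypothetical and this is not pgdumplib's implementation; it also does not decode COPY escape sequences inside values.

```python
from typing import Optional, Tuple

NULL_MARKER = r'\N'  # how COPY text output represents NULL


def split_copy_row(line: str) -> Tuple[Optional[str], ...]:
    """Split one line of COPY text output into a tuple of fields.

    Illustrative only: every field stays a string and only the null
    marker becomes None. Escape sequences are not decoded.
    """
    fields = line.rstrip('\n').split('\t')
    return tuple(None if field == NULL_MARKER else field for field in fields)


row = split_copy_row('42\tAda\t\\N\n')
print(row)  # ('42', 'Ada', None)
```

A converter that produces native types would go one step further and turn '42' into an int, which is the kind of conversion the SmartDataConverter is described as attempting.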

When loading or creating a dump, the table and blob data are stored in gzip-compressed data files in a temporary directory that is automatically cleaned up when the Dump instance is released.

class pgdumplib.dump.Dump(dbname='pgdumplib', encoding='UTF8', converter=None, appear_as='12.0')

Create a new instance of the Dump class

Once created, the instance of Dump can be used to read existing dumps or to create new ones.

Parameters
  • dbname (str) – The database name for the dump (Default: pgdumplib)

  • encoding (str) – The data encoding (Default: UTF8)

  • converter – The data converter class to use (Default: pgdumplib.converters.DataConverter)

add_entry(desc, namespace=None, tag=None, owner=None, defn=None, drop_stmt=None, copy_stmt=None, dependencies=None, tablespace=None, tableam=None, dump_id=None)

Add an entry to the dump

The namespace and tag are required.

A ValueError will be raised if desc is not a value that is known in pgdumplib.constants.

The section is

When adding data, use table_data_writer() instead of invoking add_entry() directly.

If dependencies are specified, they will be validated: if a dependency specifies a dump_id and no entry is found with that dump_id, a ValueError will be raised.

Other omitted values will be set to the defaults specified in the pgdumplib.dump.Entry class.

The dump_id will be auto-calculated based upon the existing entries if it is not specified.

Note

The creation of ad-hoc blobs is not supported.

Parameters
  • desc (str) – The entry description

  • namespace (str) – The namespace of the entry

  • tag (str) – The name/table/relation/etc of the entry

  • owner (str) – The owner of the object in Postgres

  • defn (str) – The DDL definition for the entry

  • drop_stmt (Optional[str]) – A drop statement used to drop the entry before

  • copy_stmt (Optional[str]) – A copy statement used when there is a corresponding data section.

  • dependencies (list) – A list of dump_ids of objects that the entry is dependent upon.

  • tablespace (str) – The tablespace to use

  • tableam (str) – The table access method

  • dump_id (int) – The dump id, will be auto-calculated if left empty

Raises

ValueError

Return type

pgdumplib.dump.Entry
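The following sketch shows one way add_entry() might be called to add a table and then a second entry that depends on it. The helper name add_table_with_index is hypothetical, dump is assumed to be a pgdumplib.dump.Dump instance (for example, one returned by pgdumplib.new()), and the desc strings 'TABLE' and 'INDEX' are assumed to be values known to pgdumplib.constants.

```python
def add_table_with_index(dump, namespace, table, owner, table_ddl, index_ddl):
    """Add a table entry, then an index entry that depends on it.

    `dump` is assumed to be a pgdumplib.dump.Dump. The desc strings
    'TABLE' and 'INDEX' are assumptions about pgdumplib.constants.
    """
    table_entry = dump.add_entry(
        desc='TABLE', namespace=namespace, tag=table,
        owner=owner, defn=table_ddl)
    # Reference the table by the dump_id that add_entry() assigned to it.
    index_entry = dump.add_entry(
        desc='INDEX', namespace=namespace, tag=f'{table}_idx',
        owner=owner, defn=index_ddl,
        dependencies=[table_entry.dump_id])
    return table_entry, index_entry
```

Passing the dump_id of the returned entry, rather than a hard-coded number, keeps the dependency list valid when dump_id values are auto-calculated.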

blobs()

Iterator that returns each blob in the dump

Return type

tuple(int, bytes)

get_entry(dump_id)

Return the entry for the given dump_id

Parameters

dump_id (int) – The dump ID of the entry to return.

Return type

pgdumplib.dump.Entry or None
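Because get_entry() may return None, callers usually need to check the result. A minimal sketch, assuming dump is a pgdumplib.dump.Dump and entry is one of its entries; the helper name resolve_dependencies is hypothetical:

```python
def resolve_dependencies(dump, entry):
    """Return the entries that `entry` depends on, skipping unknown ids."""
    resolved = []
    for dump_id in entry.dependencies:
        dependency = dump.get_entry(dump_id)
        if dependency is not None:  # get_entry() returns None when not found
            resolved.append(dependency)
    return resolved
```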

load(path)

Load the Dumpfile, including extracting all data into a temporary directory

Parameters

path (os.PathLike) – The path of the dump to load

Raises
  • RuntimeError

  • ValueError

Return type

Dump

lookup_entry(desc, namespace, tag)

Return the entry for the given desc, namespace, and tag

Parameters
  • desc (str) – The desc / object type of the entry

  • namespace (str) – The namespace of the entry

  • tag (str) – The tag/relation/table name

  • section (str) – The dump section the entry is for

Raises

ValueError

Return type

pgdumplib.dump.Entry or None

save(path)

Save the Dump file to the specified path

Parameters

path (os.PathLike) – The path to save the dump to

Return type

NoReturn

table_data(namespace, table)

Iterator that returns data for the given namespace and table

Parameters
  • namespace (str) – The namespace/schema for the table

  • table (str) – The table name

Raises

pgdumplib.exceptions.EntityNotFoundError

Return type

Generator[Union[str, Tuple[Any, …]], None, None]
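As a usage sketch, the iterator can be wrapped to pair each value with a column name. This assumes dump is a pgdumplib.dump.Dump loaded with a converter that yields one tuple per row, and that the caller supplies the column names in table order; the helper name rows_as_dicts is hypothetical.

```python
def rows_as_dicts(dump, namespace, table, columns):
    """Yield each row of a table as a dict keyed by column name.

    Assumes the dump's converter yields one tuple per row and that
    `columns` is supplied by the caller in the table's column order.
    """
    for row in dump.table_data(namespace, table):
        yield dict(zip(columns, row))
```

Since table_data() is a generator, this wrapper also streams rows one at a time instead of loading the whole table into memory.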

table_data_writer(entry, columns)

A context manager that is used to return a TableData instance, which can be used to add table data to the dump.

When invoked for a given entry containing the table definition,

Parameters
  • entry (Entry) – The entry for the table to add data for

  • columns (list or tuple) – The ordered list of table columns

Return type

TableData
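A minimal sketch of the context manager in use, assuming dump is a pgdumplib.dump.Dump and entry is the table's entry as returned by add_entry(). The helper name write_rows is hypothetical.

```python
def write_rows(dump, entry, columns, rows):
    """Append every row in `rows` to the table described by `entry`.

    Returns the number of rows written.
    """
    count = 0
    with dump.table_data_writer(entry, columns) as writer:
        for row in rows:
            writer.append(*row)  # one positional argument per column
            count += 1
    return count
```

Each row is unpacked into positional arguments because append() takes the columns as separate args, in the same order as the columns list.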

property version

Return the version as a tuple to make version comparisons easier.

Return type

tuple
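The short example below illustrates why a tuple makes comparisons easier than a version string. The version values are made up for illustration and the helper name at_least is hypothetical; it is not part of pgdumplib.

```python
def at_least(version, minimum):
    """Return True if `version` is greater than or equal to `minimum`.

    Tuples compare element by element, so (1, 10) sorts after (1, 9).
    """
    return tuple(version) >= tuple(minimum)


print('1.9' < '1.10')              # False: strings compare character by character
print((1, 9) < (1, 10))            # True: tuples compare number by number
print(at_least((1, 14, 0), (1, 12)))  # True
```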

class pgdumplib.dump.Entry(dump_id: int, had_dumper: bool = False, table_oid: str = '0', oid: str = '0', tag: Optional[str] = None, desc: Optional[str] = None, defn: Optional[str] = None, drop_stmt: Optional[str] = None, copy_stmt: Optional[str] = None, namespace: Optional[str] = None, tablespace: Optional[str] = None, tableam: Optional[str] = None, owner: Optional[str] = None, with_oids: bool = False, dependencies: List[int] = <factory>, data_state: int = 3, offset: int = 0)

The Entry dataclass represents a single entry in the dump

Custom-format dump files are composed primarily of entries, which contain all of the metadata and DDL required to construct the database.

For table data and blobs, there are entries that contain offset locations in the dump file that instruct the reader as to where the data lives in the file.

Variables
  • dump_id (int) – The dump id, will be auto-calculated if left empty

  • had_dumper (bool) – Indicates

  • oid (str) – The OID of the object the entry represents

  • tag (str) – The name/table/relation/etc of the entry

  • desc (str) – The entry description

  • defn (str) – The DDL definition for the entry

  • drop_stmt (str) – A drop statement used to drop the entry before

  • copy_stmt (str) – A copy statement used when there is a corresponding data section.

  • namespace (str) – The namespace of the entry

  • tablespace (str) – The tablespace to use

  • tableam (str) – The table access method

  • owner (str) – The owner of the object in Postgres

  • with_oids (bool) – Indicates …

  • dependencies (list) – A list of dump_ids of objects that the entry is dependent upon.

  • data_state (int) – Indicates if the entry has data and how it is stored

  • offset (int) – If the entry has data, the offset to the data in the file

  • section (str) – The section of the dump file the entry belongs to

property section

Return the section the entry belongs to

Return type

str

class pgdumplib.dump.TableData(dump_id, tempdir, encoding)

Encapsulates table data in a temporary file and provides an API for appending data one row at a time.

Do not create this class directly; instead, invoke table_data_writer().

append(*args)

Append a row to the table data, passing columns in as args

Column order must match the order specified when table_data_writer() was invoked.

All columns will be coerced to strings, with two special cases: None is converted to the null marker (\N), and datetime.datetime objects have the proper pg_dump timestamp format applied to them.

Return type

NoReturn
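The coercion rules described above can be sketched roughly as follows. This is an illustration, not pgdumplib's implementation: the name encode_row is hypothetical, and the tab delimiter, the timestamp format string, and the lack of escaping are all assumptions made for the example.

```python
import datetime

NULL_MARKER = r'\N'
# Assumed timestamp layout for illustration; the library's exact format
# is not stated in this reference.
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'


def encode_row(*columns) -> str:
    """Coerce each column to a string and join the fields with tabs."""
    fields = []
    for value in columns:
        if value is None:
            fields.append(NULL_MARKER)  # NULL becomes the null marker
        elif isinstance(value, datetime.datetime):
            fields.append(value.strftime(TIMESTAMP_FORMAT))
        else:
            fields.append(str(value))   # everything else via str()
    return '\t'.join(fields)


print(encode_row(1, None, datetime.datetime(2020, 1, 2, 3, 4, 5)))
```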

finish()

Invoked prior to saving a dump to close the temporary data handle and switch the class into read-only mode.

For use by pgdumplib.dump.Dump only.

Return type

NoReturn

read()

Read the data from disk for writing to the dump

For use by pgdumplib.dump.Dump only.

Return type

bytes

property size

Return the current size of the data on disk

Return type

int