podium.storage.resources package

Submodules

podium.storage.resources.downloader module

The downloader module provides classes for downloading files from a given URI. It consists of the base class BaseDownloader, which every downloader implements. Downloaders that use the HTTP protocol form a special group: their base class is HttpDownloader, and its simple implementation is SimpleHttpDownloader.

class podium.storage.resources.downloader.BaseDownloader

Bases: abc.ABC

Interface that all downloader classes implement.

abstract classmethod download(uri, path, overwrite=False, **kwargs)

Downloads a file from the given URI to the given path. If overwrite is True and the given path already exists, the existing file is overwritten with the new one.

Parameters
  • uri (str) – URI of file that needs to be downloaded

  • path (str) – destination path where to save downloaded file

  • overwrite (bool) – if True and the given path exists, the downloaded file overwrites the existing file

Returns

rewrite_status – True if the download was successful, or False if the file already exists and overwrite was False.

Return type

bool

Raises
  • ValueError – if the given uri or path is None

  • RuntimeError – if there was an error while obtaining the resource from the uri
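
The contract described above (return value, ValueError, RuntimeError) can be illustrated with a small stand-in. This is a minimal sketch, not podium's implementation: SketchDownloader and LocalCopyDownloader are hypothetical names, and the "download" is a local file copy so that the example runs without network access.

```python
import os
import shutil
from abc import ABC, abstractmethod


class SketchDownloader(ABC):
    """Hypothetical stand-in that mirrors the documented interface."""

    @classmethod
    @abstractmethod
    def download(cls, uri, path, overwrite=False, **kwargs):
        """Return True on success, False if path exists and overwrite is False."""


class LocalCopyDownloader(SketchDownloader):
    """Hypothetical downloader that treats the URI as a local file path."""

    @classmethod
    def download(cls, uri, path, overwrite=False, **kwargs):
        if uri is None or path is None:
            raise ValueError("uri and path must not be None")
        if os.path.exists(path) and not overwrite:
            # The existing file is left untouched.
            return False
        try:
            shutil.copyfile(uri, path)
        except OSError as err:
            raise RuntimeError(f"could not obtain resource from {uri}") from err
        return True
```

Because download is a classmethod, callers use it without instantiating the class, for example LocalCopyDownloader.download(src, dst, overwrite=True).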

class podium.storage.resources.downloader.HttpDownloader

Bases: podium.storage.resources.downloader.BaseDownloader

Interface for downloaders that use the HTTP protocol for data transfer.

class podium.storage.resources.downloader.SCPDownloader

Bases: podium.storage.resources.downloader.BaseDownloader

Class for downloading a file from a server using SFTP on top of the SSH protocol.

USER_NAME_KEY

key of the keyword argument that defines the username

Type

str

PASSWORD_KEY

key of the keyword argument that defines the password; if the private key file uses a passphrase, the user should define it here

Type

str, optional

HOST_ADDR_KEY

key of the keyword argument that defines the remote host address

Type

str

PRIVATE_KEY_FILE_KEY

key of the keyword argument that defines the private key location; if the user uses the default Linux private key location, this argument can be set to None

Type

str, optional

classmethod download(uri, path, overwrite=False, **kwargs)

Downloads a file from the remote machine and saves it to the local path. If overwrite is True and the given path already exists, the existing file is overwritten with the new one.

Parameters
  • uri (str) – URI of the file on remote machine

  • path (str) – path of the file on local machine

  • overwrite (bool) – if True and the given path exists, the downloaded file overwrites the existing file

  • kwargs (dict(str, str)) – keyword arguments, named by the class attributes above, that are used for connecting to the remote machine

Returns

rewrite_status – True if the download was successful, or False if the file already exists and overwrite was False.

Return type

bool

Raises
  • ValueError – If the given uri or path is None, or if the host is not defined.

  • RuntimeError – If there was an error while obtaining resource from uri.
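
Since the connection details are passed as keyword arguments whose names come from the class attributes, one way to assemble them is sketched below. The helper build_scp_kwargs and the FakeSCPDownloader stand-in are hypothetical, and the placeholder key strings are not the real values, which this reference does not show. Leaving out the password when none is given is also an assumption based on the attribute being marked optional.

```python
def build_scp_kwargs(downloader_cls, host, user, private_key_file=None, password=None):
    """Map credentials onto the keyword names exposed by a downloader class."""
    kwargs = {
        downloader_cls.HOST_ADDR_KEY: host,
        downloader_cls.USER_NAME_KEY: user,
        # None means "use the default private key location" per the attribute docs.
        downloader_cls.PRIVATE_KEY_FILE_KEY: private_key_file,
    }
    if password is not None:
        kwargs[downloader_cls.PASSWORD_KEY] = password
    return kwargs


class FakeSCPDownloader:
    """Stand-in with placeholder key values, used only to run this sketch."""

    USER_NAME_KEY = "user_placeholder"
    PASSWORD_KEY = "password_placeholder"
    HOST_ADDR_KEY = "host_placeholder"
    PRIVATE_KEY_FILE_KEY = "key_file_placeholder"


example_kwargs = build_scp_kwargs(FakeSCPDownloader, host="example.org", user="alice")
```

With podium installed, the same helper could presumably be called with SCPDownloader in place of the stand-in and the result unpacked into SCPDownloader.download(uri, path, **kwargs).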

class podium.storage.resources.downloader.SimpleHttpDownloader

Bases: podium.storage.resources.downloader.HttpDownloader

Downloader that uses the HTTP protocol. It does not offer content confirmation (as needed, for example, by Google Drive) or any kind of authentication.

classmethod download(uri, path, overwrite=False, **kwargs)

Downloads a file from the given URI to the given path. If overwrite is True and the given path already exists, the existing file is overwritten with the new one.

Parameters
  • uri (str) – URI of file that needs to be downloaded

  • path (str) – destination path where to save downloaded file

  • overwrite (bool) – if True and the given path exists, the downloaded file overwrites the existing file

Returns

rewrite_status – True if the download was successful, or False if the file already exists and overwrite was False.

Return type

bool

Raises
  • ValueError – if the given uri or path is None

  • RuntimeError – if there was an error while obtaining the resource from the uri
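
A plain download of this kind, with no authentication or confirmation step, might look roughly like the following. This is a sketch built on the standard library's urllib, not the code SimpleHttpDownloader actually runs; the function name fetch_over_http is hypothetical.

```python
import os
import shutil
import urllib.request


def fetch_over_http(uri, path, overwrite=False):
    """Stream the resource at uri into path, following the documented contract."""
    if uri is None or path is None:
        raise ValueError("uri and path must not be None")
    if os.path.exists(path) and not overwrite:
        return False
    try:
        # urlopen is evaluated first, so no output file is created if it fails.
        with urllib.request.urlopen(uri) as response, open(path, "wb") as output:
            shutil.copyfileobj(response, output)
    except OSError as err:
        # urllib.error.URLError is a subclass of OSError.
        raise RuntimeError(f"could not obtain resource from {uri}") from err
    return True
```

The response object is file-like, so it can be streamed to disk in chunks instead of being read into memory at once.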

podium.storage.resources.large_resource module

Module contains the class for defining a large resource. Classes that rely on large resources that need to be downloaded should use this module.

class podium.storage.resources.large_resource.LargeResource(**kwargs)

Bases: object

Large resource that needs to download files from a URL. The class also supports archive decompression.

BASE_RESOURCE_DIR

base large files directory path

Type

str

RESOURCE_NAME

key for defining resource directory name parameter

Type

str

URL

key for defining resource url parameter

Type

str

ARCHIVE

key for defining archiving method parameter

Type

str

SUPPORTED_ARCHIVE

list of supported archive file types

Type

list(str)

class podium.storage.resources.large_resource.SCPLargeResource(**kwargs)

Bases: podium.storage.resources.large_resource.LargeResource

Large resource that needs to download files from a URI using the SCP protocol. For all other functionality, the class relies on LargeResource.

SCP_HOST_KEY

key for keyword argument that defines remote host address

Type

str

SCP_USER_KEY

key for keyword argument that defines remote host username

Type

str

SCP_PASS_KEY

key for keyword argument that defines remote host password or passphrase used in private key

Type

str, optional

SCP_PRIVATE_KEY

key for keyword argument that defines the location of the private key; on Linux it is optional if the key is in the default location

Type

str, optional

podium.storage.resources.large_resource.init_scp_large_resource_from_kwargs(resource, uri, archive, scp_host, user_dict)

Initializes an SCP resource from resource information and user credentials.

Parameters
  • resource (str) – resource name, same as LargeResource.RESOURCE_NAME

  • uri (str) – resource uri, same as LargeResource.URI

  • archive (str) – archive type, see LargeResource.ARCHIVE

  • scp_host (str) – remote host address, see SCPLargeResource.SCP_HOST_KEY

  • user_dict (dict(str, str)) – user dictionary that may contain scp_user (the username), scp_private_key (the path to the private key), and scp_pass_key (the user password)

podium.storage.resources.util module

Module contains storage utility methods.

podium.storage.resources.util.copyfileobj_with_tqdm(finput, foutput, total_size, buffer_size=16384)

Copies the file-like input finput to the file-like output foutput. total_size is used to display a progress bar, and buffer_size determines the size of the buffer used for copying. The implementation is based on shutil.copyfileobj.

Parameters
  • finput (file-like object) – input object from which to copy the data

  • foutput (file-like object) – output object to which the data is copied

  • total_size (int) – total input file size used for computing progress and displaying progress bar

  • buffer_size (int) – constant used for determining maximal buffer size
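
The copy loop that this function builds on can be sketched as follows. To keep the example free of third-party dependencies, progress is reported through a plain callback instead of a tqdm bar; the name copy_with_progress, the on_progress parameter, and the returned byte count are assumptions made for illustration.

```python
import io


def copy_with_progress(finput, foutput, total_size, buffer_size=16384, on_progress=None):
    """Copy finput to foutput in chunks, reporting progress after each chunk."""
    copied = 0
    while True:
        chunk = finput.read(buffer_size)
        if not chunk:
            break  # end of input
        foutput.write(chunk)
        copied += len(chunk)
        if on_progress is not None:
            # A progress bar would be advanced by len(chunk) here.
            on_progress(copied, total_size)
    return copied


source = io.BytesIO(b"x" * 40000)
target = io.BytesIO()
updates = []
copy_with_progress(source, target, total_size=40000,
                   on_progress=lambda done, total: updates.append((done, total)))
```

With the default buffer size, the 40000-byte input above is copied in three chunks, so the callback fires three times.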

podium.storage.resources.util.extract_tar_file(archive_file, destination_dir, encoding='uft-8')

Extracts a tar archive to the destination, including archives created with gzip, bz2, and lzma compression.

Parameters
  • archive_file (str) – path to the archive file that needs to be extracted

  • destination_dir (str) – path of the directory into which the archive is extracted

Raises

ValueError – If given archive file doesn’t exist.
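
Handling all of these compression formats does not require separate code paths when the standard tarfile module is used. The sketch below shows one way to do it and is not podium's implementation; extract_tar_sketch is a hypothetical name, and the encoding parameter of the documented function is left out.

```python
import os
import tarfile


def extract_tar_sketch(archive_file, destination_dir):
    """Extract a (possibly compressed) tar archive into destination_dir."""
    if not os.path.exists(archive_file):
        raise ValueError(f"archive file {archive_file} does not exist")
    # Mode "r:*" lets tarfile detect gzip, bz2, or lzma compression on its own.
    with tarfile.open(archive_file, mode="r:*") as archive:
        archive.extractall(path=destination_dir)
```

As with any archive extraction, this should only be applied to archives from a trusted source.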

podium.storage.resources.util.extract_zip_file(archive_file, destination_dir)

Extracts a zip archive to the destination.

Parameters
  • archive_file (str) – path to the archive file that needs to be extracted

  • destination_dir (str) – path of the directory into which the archive is extracted

Raises

ValueError – If given archive file doesn’t exist.
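
The zip case can be sketched in the same way with the standard zipfile module. Again, this is an illustration under the documented contract and not the library's own code; extract_zip_sketch is a hypothetical name.

```python
import os
import zipfile


def extract_zip_sketch(archive_file, destination_dir):
    """Extract a zip archive into destination_dir."""
    if not os.path.exists(archive_file):
        raise ValueError(f"archive file {archive_file} does not exist")
    with zipfile.ZipFile(archive_file, mode="r") as archive:
        archive.extractall(path=destination_dir)
```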

Module contents