Buckets API Reference¶
Buckets¶
- class apolo_sdk.Buckets¶
Blob storage buckets subsystem, available as Client.buckets.
The subsystem exposes the basic functionality of the Blob Storage solutions offered by different cloud providers: S3 for AWS, Cloud Storage for GCP, etc.
- async list(cluster_name: str | None = None) AsyncContextManager[AsyncIterator[Bucket]] [source]¶
List the user’s buckets, async iterator. Yields Bucket instances.
- Parameters:
cluster_name (str) – cluster to list buckets from. Default is the current cluster.
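For illustration, a minimal sketch of iterating over buckets (assuming client is an initialized apolo_sdk.Client):
# `client` is assumed to be an initialized apolo_sdk.Client.
async with client.buckets.list() as buckets:
    async for bucket in buckets:
        print(bucket.id, bucket.name)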
- async create(name: Optional[str], cluster_name: str | None = None, org_name: str | None = None) Bucket [source]¶
Create a new bucket.
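A minimal sketch of creating a named bucket in the current cluster (assuming client as above):
# `client` is assumed to be an initialized apolo_sdk.Client.
bucket = await client.buckets.create(name="my-bucket")
print("Created bucket:", bucket.id)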
- async import_external(provider: Bucket.Provider, provider_bucket_name: str, credentials: Mapping[str, str], name: str | None = None, cluster_name: str | None = None, org_name: str | None = None) Bucket [source]¶
Import an existing external bucket.
- Parameters:
provider (Bucket.Provider) – Provider type of imported bucket.
provider_bucket_name (str) – Name of external bucket inside the provider.
credentials (Mapping[str, str]) – Raw credentials to access bucket provider.
name (Optional[str]) – Name of the bucket. Should be unique among all the user’s buckets.
cluster_name (str) – cluster to import the bucket into. Default is the current cluster.
org_name (str) – org to import the bucket into. Default is the current org.
- Returns:
Newly imported bucket info (Bucket)
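A hedged sketch of importing an existing AWS S3 bucket; the credential keys shown are illustrative and the exact mapping depends on the provider:
import apolo_sdk

# `client` is assumed to be an initialized apolo_sdk.Client.
# The credential keys below are illustrative; the exact mapping is provider-specific.
bucket = await client.buckets.import_external(
    provider=apolo_sdk.Bucket.Provider.AWS,
    provider_bucket_name="my-external-bucket",
    credentials={
        "access_key_id": "<AWS access key id>",
        "secret_access_key": "<AWS secret access key>",
    },
    name="imported-bucket",
)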
- async get(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) Bucket [source]¶
Get a bucket with id or name bucket_id_or_name.
- async rm(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Delete a bucket with id or name bucket_id_or_name.
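For example, looking up a bucket by name and then deleting it (a sketch, assuming client as above):
# `client` is assumed to be an initialized apolo_sdk.Client.
bucket = await client.buckets.get("my-bucket")
await client.buckets.rm(bucket.id)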
- async request_tmp_credentials(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) BucketCredentials [source]¶
Get temporary provider credentials for the bucket with id or name bucket_id_or_name.
- Parameters:
bucket_id_or_name (str) – bucket’s id or name.
cluster_name (str) – cluster to look for a bucket. Default is the current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by name. Default is the current user.
- Returns:
Bucket credentials info (BucketCredentials)
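A short sketch of requesting temporary credentials (assuming client as above; the keys inside the returned credentials mapping are provider-specific):
# `client` is assumed to be an initialized apolo_sdk.Client.
tmp_creds = await client.buckets.request_tmp_credentials("my-bucket")
print(tmp_creds.credentials)  # raw provider-specific credential mapping (assumed attribute)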
- async set_public_access(bucket_id_or_name: str, public_access: bool, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) Bucket [source]¶
Enable or disable public (anonymous) read access to the bucket.
- Parameters:
bucket_id_or_name (str) – bucket’s id or name.
public_access (bool) – whether public access should be enabled.
cluster_name (str) – cluster to look for a bucket. Default is the current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by name. Default is the current user.
- Returns:
Bucket info (Bucket)
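For example, making a bucket publicly readable (a sketch, assuming client as above):
# `client` is assumed to be an initialized apolo_sdk.Client.
bucket = await client.buckets.set_public_access("my-bucket", True)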
- async head_blob(bucket_id_or_name: str, key: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) BucketEntry [source]¶
Look up the blob and return its metadata.
- Parameters:
bucket_id_or_name (str) – bucket’s id or name.
key (str) – Key of the blob.
cluster_name (str) – cluster to look for a bucket. Default is the current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by name. Default is the current user.
- Returns:
BucketEntry object.
- Raises:
ResourceNotFound if key does not exist.
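A sketch of probing for a blob’s existence via head_blob (assuming client as above):
# `client` is assumed to be an initialized apolo_sdk.Client.
from apolo_sdk import ResourceNotFound

try:
    entry = await client.buckets.head_blob("my-bucket", key="file.txt")
    print("Blob exists:", entry.key)
except ResourceNotFound:
    print("Blob does not exist")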
- async put_blob(bucket_id_or_name: str, key: str, body: Union[AsyncIterator[bytes], bytes], cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Create or replace the blob identified by key in the bucket, e.g.:
from pathlib import Path

large_file = Path("large_file.dat")

async def body_stream():
    with large_file.open("rb") as f:  # binary mode so that chunks are bytes
        for line in f:
            yield line

await client.buckets.put_blob(
    bucket_id_or_name="my_bucket",
    key="large_file.dat",
    body=body_stream(),
)
- Parameters:
bucket_id_or_name (str) – bucket’s id or name.
key (str) – Key of the blob.
body (Union[bytes, AsyncIterator[bytes]]) – Body of the blob. Can be passed as either bytes or as an AsyncIterator[bytes].
cluster_name (str) – cluster to look for a bucket. Default is the current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by name. Default is the current user.
- async fetch_blob(bucket_id_or_name: str, key: str, offset: int = 0, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) AsyncIterator[bytes] [source]¶
Look up the blob and return only its body content. The content is streamed using an asynchronous iterator, e.g.:
async with client.buckets.fetch_blob("my_bucket", key="file.txt") as content:
    async for data in content:
        print("Next chunk of data:", data)
- Parameters:
bucket_id_or_name (str) – bucket’s id or name.
key (str) – Key of the blob.
offset (int) – Position in blob from which to read.
cluster_name (str) – cluster to look for a bucket. Default is the current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by name. Default is the current user.
- async delete_blob(bucket_id_or_name: str, key: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Remove a blob from the bucket.
- async list_blobs(uri: URL, recursive: bool = False, limit: int = 10000) AsyncContextManager[AsyncIterator[BucketEntry]] [source]¶
List blobs in the bucket. You can filter by prefix and, with recursive=False, get results resembling a folder structure.
- Parameters:
uri (URL) – URL that specifies the bucket and the prefix to list blobs under, e.g. yarl.URL("blob:bucket_name/path/in/bucket").
recursive (bool) – If True, the listing will contain all keys filtered by prefix; with False, only keys up to the next / are returned. Keys elided this way are combined under a common prefix and returned as a BlobCommonPrefix.
limit (int) – Maximum number of BucketEntry objects returned.
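A sketch of a non-recursive listing that mimics browsing one folder level (assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
async with client.buckets.list_blobs(
    uri=URL("blob:bucket_name/path/in/bucket"),
    recursive=False,
    limit=100,
) as entries:
    async for entry in entries:
        # BlobCommonPrefix entries stand in for "subfolders".
        print(entry.key)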
- async glob_blobs(uri: URL) AsyncContextManager[AsyncIterator[BucketEntry]] [source]¶
Glob search for blobs in the bucket:
async with client.buckets.glob_blobs(
    uri=URL("blob:my_bucket/folder1/**/*.txt")
) as blobs:
    async for blob in blobs:
        print(blob.key)
As in Storage.glob(), the “**” pattern means “this directory and all sub-directories, recursively”.
- Parameters:
uri (URL) – URL that specifies the bucket and the pattern to glob blobs with, e.g. yarl.URL("blob:bucket_name/path/**/*.bin").
- async upload_file(src: URL, dst: URL, *, update: bool = False, progress: AbstractFileProgress | None = None) None [source]¶
Similarly to Storage.upload_file(), uploads the local file src to the bucket URL dst.
- Parameters:
src (URL) – path of the file to upload on the local disk, e.g. yarl.URL("file:///home/andrew/folder/file.txt").
dst (URL) – URL that specifies the bucket and key to upload the file to, e.g. yarl.URL("blob:bucket_name/folder/file.txt").
update (bool) – if true, upload only when the source file is newer than the destination file or when the destination file is missing.
progress (AbstractFileProgress) – a callback interface for reporting upload progress, None for no progress report (default).
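For example, uploading a single file only when it changed (a sketch, assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
await client.buckets.upload_file(
    src=URL("file:///home/andrew/folder/file.txt"),
    dst=URL("blob:bucket_name/folder/file.txt"),
    update=True,  # skip the upload if the destination is already up to date
)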
- async download_file(src: URL, dst: URL, *, update: bool = False, continue_: bool = False, progress: AbstractFileProgress | None = None) None [source]¶
Similarly to Storage.download_file(), downloads the remote file src to the local path dst.
- Parameters:
src (URL) – URL that specifies the bucket and blob key to download, e.g. yarl.URL("blob:bucket_name/folder/file.bin").
dst (URL) – local path to save the downloaded file to, e.g. yarl.URL("file:///home/andrew/folder/file.bin").
update (bool) – if true, download only when the source file is newer than the destination file or when the destination file is missing.
continue_ (bool) – if true, download only the part of the source file past the end of the destination file and append it to the destination file, provided the destination file is newer and not longer than the source file. Otherwise download and overwrite the whole file.
progress (AbstractFileProgress) – a callback interface for reporting download progress, None for no progress report (default).
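For example, resuming a partially downloaded file (a sketch, assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
await client.buckets.download_file(
    src=URL("blob:bucket_name/folder/file.bin"),
    dst=URL("file:///home/andrew/folder/file.bin"),
    continue_=True,  # append the missing tail instead of re-downloading everything
)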
- async upload_dir(src: URL, dst: URL, *, update: bool = False, filter: Callable[[str], Awaitable[bool]] | None = None, ignore_file_names: AbstractSet[str] = frozenset(), progress: AbstractRecursiveFileProgress | None = None) None [source]¶
Similarly to Storage.upload_dir(), recursively uploads the local directory src to the Blob Storage URL dst.
- Parameters:
src (URL) – path of the directory to upload on the local disk, e.g. yarl.URL("file:///home/andrew/folder").
dst (URL) – path on Blob Storage to save the uploaded directory to, e.g. yarl.URL("blob:bucket_name/folder/").
update (bool) – if true, upload only when the source file is newer than the destination file or when the destination file is missing.
filter (Callable[[str], Awaitable[bool]]) – a callback function that determines which files and subdirectories should be uploaded. It is called with the relative path of a file or directory, and if the result is false the file or directory is skipped.
ignore_file_names (AbstractSet[str]) – a set of names of files which specify filters for skipping files and subdirectories. The format of ignore files is the same as .gitignore.
progress (AbstractRecursiveFileProgress) – a callback interface for reporting upload progress, None for no progress report (default).
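A sketch of a filtered recursive upload; the skip_hidden helper is hypothetical and only illustrates the filter contract:
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
async def skip_hidden(path: str) -> bool:
    # Hypothetical filter: upload everything except hidden files/directories.
    return not path.split("/")[-1].startswith(".")

await client.buckets.upload_dir(
    src=URL("file:///home/andrew/folder"),
    dst=URL("blob:bucket_name/folder/"),
    filter=skip_hidden,
    ignore_file_names={".gitignore"},  # also honor .gitignore files found in the tree
)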
- async download_dir(src: URL, dst: URL, *, update: bool = False, continue_: bool = False, filter: Callable[[str], Awaitable[bool]] | None = None, progress: AbstractRecursiveFileProgress | None = None) None [source]¶
Similarly to Storage.download_dir(), recursively downloads the remote directory src to the local path dst.
- Parameters:
src (URL) – path on Blob Storage to download the directory from, e.g. yarl.URL("blob:bucket_name/folder/").
dst (URL) – local path to save the downloaded directory to, e.g. yarl.URL("file:///home/andrew/folder").
update (bool) – if true, download only when the source file is newer than the destination file or when the destination file is missing.
continue_ (bool) – if true, download only the part of the source file past the end of the destination file and append it to the destination file, provided the destination file is newer and not longer than the source file. Otherwise download and overwrite the whole file.
filter (Callable[[str], Awaitable[bool]]) – a callback function that determines which files and subdirectories should be downloaded. It is called with the relative path of a file or directory, and if the result is false the file or directory is skipped.
progress (AbstractRecursiveFileProgress) – a callback interface for reporting download progress, None for no progress report (default).
- async blob_is_dir(uri: URL) bool [source]¶
Check whether uri specifies a “folder” blob in a bucket.
- Parameters:
uri (URL) – URL that specifies the bucket and blob key, e.g. yarl.URL("blob:bucket_name/folder/sub_folder").
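For example (a sketch, assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
if await client.buckets.blob_is_dir(URL("blob:bucket_name/folder/sub_folder")):
    print("The URI denotes a folder-like prefix")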
- async blob_rm(uri: URL, *, recursive: bool = False, progress: AbstractDeleteProgress | None = None) None [source]¶
Remove blobs from the bucket.
- Parameters:
uri (URL) – URL that specifies the bucket and blob key, e.g. yarl.URL("blob:bucket_name/folder/sub_folder").
recursive (bool) – remove a directory recursively with all nested files and folders if True (False by default).
progress (AbstractDeleteProgress) – a callback interface for reporting delete progress, None for no progress report (default).
- Raises:
IsADirectoryError if uri points to a directory and the recursive flag is not set.
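For example, removing a whole “folder” (a sketch, assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
await client.buckets.blob_rm(
    uri=URL("blob:bucket_name/folder/sub_folder"),
    recursive=True,  # required when the URI denotes a "folder"
)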
- async make_signed_url(uri: URL, expires_in_seconds: int = 3600) URL [source]¶
Generate a signed URL that allows temporary access to the blob.
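For example (a sketch, assuming client as above):
from yarl import URL

# `client` is assumed to be an initialized apolo_sdk.Client.
url = await client.buckets.make_signed_url(
    URL("blob:bucket_name/folder/file.bin"),
    expires_in_seconds=1800,  # the link is valid for 30 minutes
)
print("Temporary link:", url)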
- async get_disk_usage(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) AsyncContextManager[AsyncIterator[BucketUsage]] [source]¶
Get the disk space usage of a given bucket. The iterator yields partial results, since the calculation for the whole bucket can take time.
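A sketch of consuming the partial results (assuming client as above; total_bytes is an assumed BucketUsage attribute):
# `client` is assumed to be an initialized apolo_sdk.Client.
# `total_bytes` is an assumed attribute of BucketUsage.
async with client.buckets.get_disk_usage("my-bucket") as usage_it:
    async for usage in usage_it:
        print("Bytes counted so far:", usage.total_bytes)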
- async persistent_credentials_list(cluster_name: str | None = None) AsyncContextManager[AsyncIterator[PersistentBucketCredentials]] [source]¶
List the user’s persistent bucket credentials, async iterator. Yields PersistentBucketCredentials instances.
- Parameters:
cluster_name (str) – cluster to list persistent credentials from. Default is the current cluster.
- async persistent_credentials_create(bucket_ids: Iterable[str], name: Optional[str], read_only: bool | None = False, cluster_name: str | None = None) PersistentBucketCredentials [source]¶
Create new persistent credentials for the given set of buckets.
- Parameters:
bucket_ids (Iterable[str]) – Iterable of bucket ids to create credentials for.
name (Optional[str]) – Name of the persistent credentials. Should be unique among all the user’s persistent bucket credentials.
read_only (bool) – Allow only read-only access using the created credentials. False by default.
cluster_name (str) – cluster to create the persistent credentials in. Default is the current cluster.
- Returns:
Newly created credentials info (PersistentBucketCredentials)
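A sketch of creating read-only persistent credentials for one bucket (assuming client as above; the credentials list attribute is documented under PersistentBucketCredentials below):
# `client` is assumed to be an initialized apolo_sdk.Client.
bucket = await client.buckets.get("my-bucket")
creds = await client.buckets.persistent_credentials_create(
    bucket_ids=[bucket.id],
    name="my-creds",
    read_only=True,
)
for per_bucket in creds.credentials:
    print(per_bucket.credentials)  # assumed per-bucket raw credential mapping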
- async persistent_credentials_get(credential_id_or_name: str, cluster_name: str | None = None) PersistentBucketCredentials [source]¶
Get persistent credentials by the id or name credential_id_or_name.
- Parameters:
credential_id_or_name (str) – the credentials’ id or name.
cluster_name (str) – cluster to look for the credentials in. Default is the current cluster.
- Returns:
Credentials info (PersistentBucketCredentials)
Bucket¶
- class apolo_sdk.Bucket¶
Read-only dataclass for describing a single bucket.
- provider¶
Blob storage provider this bucket belongs to, Bucket.Provider.
BucketCredentials¶
Bucket.Provider¶
PersistentBucketCredentials¶
- class apolo_sdk.PersistentBucketCredentials¶
Read-only dataclass describing persistent credentials to some set of buckets, created on user request.
- name¶
The credentials name set by the user, unique among all the user’s bucket credentials; str or None if no name was set.
- credentials¶
List of per-bucket credentials, List[BucketCredentials].
BucketEntry¶
- class apolo_sdk.BucketEntry¶
An abstract dataclass for describing bucket content entries.
- created_at¶
Blob creation timestamp, datetime or None if the underlying blob engine does not store such information.
BlobObject¶
- class apolo_sdk.BlobObject¶
A subclass of BucketEntry used for keys that are present directly in the underlying blob storage.
BlobCommonPrefix¶
- class apolo_sdk.BlobCommonPrefix¶
A subclass of BucketEntry describing a common prefix for blobs in a non-recursive listing. You can treat it as a kind of folder on Blob Storage.