This repository was archived by the owner on May 9, 2024. It is now read-only.

feat: Filebrowser in storage-proxy #25

Closed
wants to merge 94 commits into from

Conversation

@achimnol (Member) commented Jul 25, 2021

Moves file browsers to the storage proxy.
This will greatly reduce the burden of manager and webserver to mediate large file transfers.

  • Implement filebrowser container mgmt (with a configurable maximum number of containers)
  • Add new manager-facing storage-proxy APIs to create/destroy filebrowser sessions
  • manager: Add new vfolder APIs to map the new manager-facing storage-proxy APIs
  • client-py: Add new functional wrappers and vfolder subcommands to use the filebrowser session APIs
  • webui: Update the current filebrowser UIs to use the new filebrowser session mgmt APIs

To think:

  • Shall we allow explicit listing of filebrowser sessions on the client side?
  • How should we expire unused filebrowser sessions automatically?
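The manager-facing session APIs above could be backed by a small in-memory registry on the storage-proxy side, which also gives a natural place to enforce the configurable container limit. A minimal sketch (all names are hypothetical; the real implementation would launch and tear down actual filebrowser containers):

```python
import uuid
from dataclasses import dataclass


@dataclass
class FileBrowserSession:
    session_id: str
    vfolders: list


class FileBrowserManager:
    """In-memory registry of filebrowser sessions, capped at max_containers."""

    def __init__(self, max_containers: int = 5):
        self.max_containers = max_containers
        self.sessions: dict[str, FileBrowserSession] = {}

    def create(self, vfolders: list) -> FileBrowserSession:
        # Enforce the configurable maximum number of containers.
        if len(self.sessions) >= self.max_containers:
            raise RuntimeError("filebrowser container limit reached")
        sid = uuid.uuid4().hex
        session = FileBrowserSession(session_id=sid, vfolders=vfolders)
        self.sessions[sid] = session
        return session

    def destroy(self, session_id: str) -> None:
        # Here this only drops the bookkeeping entry; the real API
        # would also stop and remove the backing container.
        self.sessions.pop(session_id)
```

The open questions below (listing sessions, auto-expiry) would hang off this same registry.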

@achimnol achimnol added this to the 21.03 milestone Jul 25, 2021
@achimnol achimnol added the feature New feature or request label Jul 25, 2021
@leksikov (Contributor) commented Jan 25, 2022

  • Implemented the main feature: communication between the Client, Manager, and Storage Proxy to support a FileBrowser session, where the user can attach multiple folders and copy/move data using the UI.
  • Implemented the client-side CLI commands to start a filebrowser with attached folders.
  • Implemented the Manager API interface to redirect requests from the Client to the Storage Proxy and receive the response from the Storage Proxy containing the URL, port number, and container ID.
  • Using the CLI, the user can destroy a filebrowser session with the container ID printed by the CLI.
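The response the Manager relays from the Storage Proxy could be a simple JSON payload carrying the three values named above. A hypothetical sketch of that shape (field names and values are placeholders, not the PR's actual wire format):

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CreateFileBrowserResponse:
    # Fields mirror what the bullet above describes: URL, port, container ID.
    url: str
    port: int
    container_id: str


# Example payload with placeholder values:
resp = CreateFileBrowserResponse(
    url="http://127.0.0.1", port=8080, container_id="abc123",
)
payload = json.dumps(asdict(resp))
```

The client keeps `container_id` around so the user can later pass it to the destroy command.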

While implementing the filebrowser, I decoupled it from Vfolder as a separate app.
This gives the filebrowser its own routing management for more complex commands.
Decoupling also improves maintainability and makes it easier to implement advanced FileBrowser features without affecting Vfolder.

Next things to do:

  • Need to decide how and where to store the filebrowser container object so it can be auto-destroyed after a certain time.
  • The FileBrowser has customization options such as a company title and logo for the WebUI. These could be implemented by sending an exec command to adjust its settings options.
  • Support multiple docker container sessions per user at the same time, perhaps by assigning a different randomized access port as a quick and easy solution.
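For the auto-destroy item in the first to-do, one option is a periodic reaper that records each session's last activity and destroys sessions idle beyond a TTL. A minimal sketch (names and policy are illustrative, not the PR's implementation; the real reaper would also stop the backing container):

```python
import time


class IdleReaper:
    """Tracks per-session last-activity timestamps and reaps idle sessions."""

    def __init__(self, ttl: float):
        self.ttl = ttl  # seconds of allowed inactivity
        self.last_active: dict[str, float] = {}

    def touch(self, session_id: str) -> None:
        # Called on every proxied request for the session.
        self.last_active[session_id] = time.monotonic()

    def reap(self) -> list:
        # Called periodically (e.g. from an asyncio timer task);
        # returns the session IDs that were expired on this pass.
        now = time.monotonic()
        expired = [s for s, t in self.last_active.items() if now - t > self.ttl]
        for s in expired:
            del self.last_active[s]
        return expired
```

In the storage-proxy this would run as a background task alongside the container registry.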

@achimnol (Member Author) left a comment:
Just some initial reviews. (Since I'm the creator of this PR, I cannot "approve" or "request changes", so I'm leaving the review as a plain comment.)
We need to think more about automatically terminating "idle" filebrowser containers.

Comment on lines 42 to 45
if "g" in str(memory):
    memory = memory * 1e+9
elif "m" in str(memory):
    memory = memory * 1e+6
@achimnol (Member Author):
We could use BinarySize from ai.backend.common.validators (usually imported as tx, meaning "extended trafarets") in .config.local_config_iv to avoid repeating calculations like this.
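For illustration, a simplified stand-in for what such a size validator does, normalizing strings like "2g" or "512m" into bytes at config-load time so the hot path never has to (this sketch uses binary multipliers; the actual BinarySize semantics may differ, e.g. the snippet above treats "g" decimally as 1e+9):

```python
def parse_binary_size(value) -> int:
    """Normalize sizes like '2g', '512m', '1k', or plain ints into bytes.

    An illustrative stand-in for a BinarySize-style validator,
    using binary multipliers (2**10 per step).
    """
    if isinstance(value, int):
        return value
    units = {"k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}
    s = str(value).strip().lower().rstrip("ib")  # accept '2g', '2gib', etc.
    if s and s[-1] in units:
        return int(float(s[:-1]) * units[s[-1]])
    return int(s)
```

Running this once in the config validator means downstream code only ever sees an integer byte count.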

Comment on lines 25 to 29
def mangle_path(mount_path, vfid):
prefix1 = vfid[0:2]
prefix2 = vfid[2:4]
rest = vfid[4:]
return Path(mount_path, prefix1, prefix2, rest)
@achimnol (Member Author):
Could we refactor this out as a reusable function?
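A refactored, reusable version might look like this (a sketch; the two-level prefix fan-out scheme is taken directly from the snippet above):

```python
from pathlib import Path


def mangle_path(mount_path, vfid: str) -> Path:
    """Map a vfolder ID onto disk as <mount>/<vfid[0:2]>/<vfid[2:4]>/<rest>.

    The two 2-character prefix levels fan directories out so no single
    directory accumulates an enormous flat listing of vfolders.
    """
    return Path(mount_path, vfid[0:2], vfid[2:4], vfid[4:])
```

Placed in a shared module, both the vfolder code and the filebrowser code could import it instead of re-deriving the layout.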

settings_path = ctx.local_config["filebrowser"]["settings_path"]
mount_path = ctx.local_config["filebrowser"]["mount_path"]
cpu_count = ctx.local_config["filebrowser"]["max-cpu"]
memory = ctx.local_config["filebrowser"]["max-mem"]
@achimnol (Member Author):

Please implement the max-containers configuration.
This would require local tracking of container states, like the agent does, so that we can count the existing containers and get notified when they terminate on their own.

Ideally, we could extend the ai.backend.common.docker module to provide a Docker lifecycle event subscription with basic parsing into typed Python objects, and reuse it across the agent and storage-proxy.
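The container-state tracking could be a small component fed by Docker lifecycle events. A sketch assuming the standard Docker events JSON shape (`Type`, `status`, `Actor.ID`); the event subscription itself (e.g. via aiodocker) and the typed-object parsing the comment proposes are omitted:

```python
class ContainerTracker:
    """Tracks live filebrowser containers from Docker lifecycle events."""

    def __init__(self, max_containers: int):
        self.max_containers = max_containers
        self.live: set[str] = set()

    def can_launch(self) -> bool:
        # Consulted before creating a new filebrowser session.
        return len(self.live) < self.max_containers

    def handle_event(self, event: dict) -> None:
        # Fed from the Docker events stream; ignore non-container events.
        if event.get("Type") != "container":
            return
        cid = event.get("Actor", {}).get("ID", "")
        status = event.get("status")
        if status == "start":
            self.live.add(cid)
        elif status in ("die", "destroy"):
            # Containers that exit on their own are noticed here,
            # freeing capacity without any polling.
            self.live.discard(cid)
```

This is the "get notified when they terminate" part: capacity is reclaimed as soon as the `die` event arrives.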

@leksikov (Contributor) commented Apr 11, 2022

With the recent git commit, I proposed a simple solution to prevent multiple monitors and network-monitoring tasks from running for a single filebrowser container. The monitor first checks whether a file lock exists. If it does not, the monitor asynchronously creates the file lock and continues its monitoring task.

Any other monitors that get called also check for the file lock. Since the lock already exists, created by the first monitor, all subsequent monitors terminate without proceeding with their monitoring task. Hence, only the monitor that created the file lock first keeps running.

The file lock is automatically deleted when the storage proxy is shut down.

@achimnol (Member Author) commented:

> With the recent git commit, I proposed a simple solution to prevent multiple monitors and network-monitoring tasks from running for a single filebrowser container. The monitor first checks whether a file lock exists. If it does not, the monitor asynchronously creates the file lock and continues its monitoring task.
>
> Any other monitors that get called also check for the file lock. Since the lock already exists, created by the first monitor, all subsequent monitors terminate without proceeding with their monitoring task. Hence, only the monitor that created the file lock first keeps running.
>
> The file lock is automatically deleted when the storage proxy is shut down.

Why not use ai.backend.common.lock.FileLock and/or ai.backend.common.distributed.GlobalTimer?
Simply opening files may have races!
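The race comes from the check-then-create gap: two monitors can both observe the lock file as absent and both proceed. An atomic advisory lock closes that gap; here is an illustrative sketch using `flock(2)` (not the actual FileLock implementation, but the same kind of primitive such a helper wraps):

```python
import fcntl
import os


def try_acquire_monitor_lock(path: str):
    """Try to take a non-blocking exclusive lock on `path`.

    Returns the open fd on success, or None if another monitor already
    holds the lock. Unlike "check if the file exists, then create it",
    flock() is atomic: of two racing callers, exactly one wins.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd  # keep this fd open for as long as the lock is held
    except BlockingIOError:
        os.close(fd)
        return None
```

A further benefit over a plain marker file: the kernel releases the lock automatically if the holder crashes, so a stale lock file cannot permanently block all monitors.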

@leksikov (Contributor) commented Apr 12, 2022

Previously, implemented server-side prevention of multiple monitor runs using common.FileLock.

Implemented UID and GID settings in a startup.sh script after the containers are set up. The input variables are defined in the .toml settings file.

Refactored to handle multiple volumes and their paths in the settings file.

Added a host:volume option in client-py. This is needed to specify the volume on which the folders are located, since a folder with the same name can exist on multiple volumes. Based on the given options (volume and name), the vfid of the vfolder is inferred; the filebrowser can then mount those directories into the container using the volume path and vfid.

Example client-py command execution:
backend.ai filebrowser create -host local:volume1 -vf mydata1 -vf mydata2
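Splitting that `-host` argument on the client side is straightforward; a sketch (a hypothetical helper, not the PR's actual code):

```python
def parse_host_option(value: str) -> tuple:
    """Split a '-host proxy:volume' argument (e.g. 'local:volume1')
    into (proxy, volume), rejecting malformed input early so the user
    gets a clear CLI error instead of a server-side failure."""
    proxy, sep, volume = value.partition(":")
    if not sep or not proxy or not volume:
        raise ValueError(f"expected 'proxy:volume', got {value!r}")
    return proxy, volume
```

With the volume pinned this way, the `-vf` folder names are unambiguous even when the same folder name exists on several volumes.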

@inureyes (Member) commented Nov 8, 2023

This PR is resolved at lablup/backend.ai#710

@inureyes inureyes closed this Nov 8, 2023
Labels: feature (New feature or request), storage proxy
3 participants