MFC Containerization #971
https://hub.docker.com/r/mohdsaid497566/mfc/tags?ordering=name |
Running shb-m1pro-3: spencer/Downloads $ docker run --rm -it --platform=linux/amd64 mohdsaid497566/mfc:v4.9.6-cpu
root@6175637cf718:/opt/MFC# ls
CITATION.cff Dockerfile README.md build examples mfc.sh src toolchain
CMakeLists.txt LICENSE benchmarks docs mfc.bat misc tests
root@6175637cf718:/opt/MFC# ./mfc.sh run ./examples/1D_sodshocktube/case.py -n 1 -j 20
mfc: OK > (venv) Entered the Python 3.10.12 virtual environment (>= 3.8).
.=++*: -+*+=. | root@6175637cf718 [Linux]
:+ -*- == =* . | -------------------------
:*+ == ++ .+- |
:*##-.....:*+ .#%+++=--+=:::. | --jobs 20
-=-++-======#=--**+++==+*++=::-:. | --mpi --no-gpu --no-debug --no-gcov --no-unified
.:++=----------====+*= ==..:%..... | --targets pre_process, simulation, and post_process
.:-=++++===--==+=-+= +. := |
+#=::::::::=%=. -+: =+ *: | ----------------------------------------------------------
.*=-=*=.. :=+*+: -...-- | $ ./mfc.sh (build, run, test, clean, count, packer) --help
Acquiring /opt/MFC/examples/1D_sodshocktube/case.py...
Acquiring /opt/MFC/examples/1D_sodshocktube/case.py...
Build | syscheck, syscheck, pre_process, simulation, and post_process | Generic Build
Generating case.fpp.
Writing a (new) custom case.fpp file.
$ cmake --build /opt/MFC/build/staging/7d9b728a37 --target syscheck --parallel 20 --config Release
[100%] Built target syscheck
$ cmake --install /opt/MFC/build/staging/7d9b728a37
-- Install configuration: "Release"
-- Up-to-date: /opt/MFC/build/install/7d9b728a37/bin/syscheck
Generating case.fpp.
Writing a (new) custom case.fpp file.
$ cmake --build /opt/MFC/build/staging/9a4af0a3bd --target pre_process --parallel 20 --config Release
[100%] Built target pre_process
$ cmake --install /opt/MFC/build/staging/9a4af0a3bd
-- Install configuration: "Release"
-- Up-to-date: /opt/MFC/build/install/9a4af0a3bd/bin/pre_process
Generating case.fpp.
Writing a (new) custom case.fpp file.
$ cmake --build /opt/MFC/build/staging/98998883b5 --target simulation --parallel 20 --config Release
[100%] Built target simulation
$ cmake --install /opt/MFC/build/staging/98998883b5
-- Install configuration: "Release"
-- Up-to-date: /opt/MFC/build/install/98998883b5/bin/simulation
Generating case.fpp.
Writing a (new) custom case.fpp file.
$ cmake --build /opt/MFC/build/staging/03b34a2688 --target post_process --parallel 20 --config Release
[100%] Built target post_process
$ cmake --install /opt/MFC/build/staging/03b34a2688
-- Install configuration: "Release"
-- Up-to-date: /opt/MFC/build/install/03b34a2688/bin/post_process
Run
Using queue system Interactive.
Using baked-in template for default.
Generating input files for syscheck...
Generating syscheck.inp:
INFO: Forwarded 0/49 parameters.
Generating input files for pre_process...
Generating pre_process.inp:
INFO: Forwarded 31/49 parameters.
Generating input files for simulation...
Generating simulation.inp:
INFO: Forwarded 33/49 parameters.
Generating input files for post_process...
Generating post_process.inp:
INFO: Forwarded 20/49 parameters.
$ /bin/bash /opt/MFC/examples/1D_sodshocktube/MFC.sh
+-----------------------------------------------------------------------------------------------------------+
| MFC case # MFC @ /opt/MFC/examples/1D_sodshocktube/case.py: |
+-----------------------------------------------------------------------------------------------------------+
| * Start-time 00:47:07 * Start-date 00:47:07 |
| * Partition N/A * Walltime 01:00:00 |
| * Account N/A * Nodes 1 |
| * Job Name MFC * Engine interactive |
| * QoS N/A * Binary N/A |
| * Queue System Interactive * Email N/A |
+-----------------------------------------------------------------------------------------------------------+
mfc: WARNING > This is the default template.
mfc: WARNING > It is not intended to support all systems and execution engines.
mfc: WARNING > Consider using a different template via the --computer option if you encounter problems.
mfc: OK > :) Selected MPI launcher mpirun. Use --binary to override.
mfc: OK > :) Running syscheck:
+ mpirun -np 1 /opt/MFC/build/install/7d9b728a37/bin/syscheck
[TEST] MPI: call mpi_init(ierr)
[TEST] MPI: call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
[TEST] MPI: call mpi_barrier(MPI_COMM_WORLD, ierr)
[TEST] MPI: call assert(rank >= 0)
[TEST] MPI: call mpi_comm_size(MPI_COMM_WORLD, nRanks, ierr)
[TEST] MPI: call assert(nRanks > 0 .and. rank < nRanks)
[SKIP] ACC: devtype = acc_get_device_type()
[SKIP] ACC: num_devices = acc_get_num_devices(devtype)
[SKIP] ACC: call assert(num_devices > 0)
[SKIP] ACC: call acc_set_device_num(mod(rank, nRanks), devtype)
[SKIP] ACC: allocate(arr(1:N))
[SKIP] ACC: !$acc enter data create(arr(1:N))
[SKIP] ACC: !$acc parallel loop
[SKIP] ACC: !$acc update host(arr(1:N))
[SKIP] ACC: !$acc exit data delete(arr)
[TEST] MPI: call mpi_barrier(MPI_COMM_WORLD, ierr)
[TEST] MPI: call mpi_finalize(ierr)
Syscheck: PASSED.
mfc: OK > :) Running pre_process:
+ mpirun -np 1 /opt/MFC/build/install/9a4af0a3bd/bin/pre_process
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7fffff49e960 in ???
#1 0x7fffff49dac5 in ???
#2 0x7fffff09051f in ???
#3 0x55555555cfd3 in MAIN__
#4 0x5555555575fe in main
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node 6175637cf718 exited on signal 4 (Illegal instruction).
--------------------------------------------------------------------------
mfc: ERROR > :( /opt/MFC/build/install/9a4af0a3bd/bin/pre_process failed with exit code 132. |
Also I no longer see codespaces available in the MFC repo |
for setting up a codespace in the MFC repo (under mflowcode) i have this in my settings: Codespaces |
as someone unfamiliar with docker but familiar with many other "coding thing" I was unable to get this running after trying many different things |
This sounds like the right path, I just can't run the containers on my machine right now (so new users definitely won't use it in current state. if macos + arm is the issue, well many people own Macbooks...) |
Yeah, I just tried that specific image and I do not seem to have any issues running anything on WSL or Codespaces. I will see if I can downgrade Ubuntu or get a functional base container for macOS to make it run somehow.
If you want, try it on Phoenix until I figure something out, or on anything non-macOS. |
https://docs.docker.com/build/building/multi-platform/ I found this doc on multi-platform builds; it says Docker automatically selects the correct image variant based on the host's architecture. |
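The SIGILL above came from an amd64 image running under emulation on an Apple-silicon Mac. As a rough sketch of picking the platform explicitly instead, the helper below maps a machine architecture string (as printed by `uname -m`) to a Docker platform string. The function name `docker_platform_for` is hypothetical, and whether a given MFC tag actually ships an arm64 variant is an assumption to check on Docker Hub.

```shell
# Hypothetical helper (not part of MFC or Docker): map an architecture name,
# as reported by `uname -m`, to the value expected by `docker run --platform`.
docker_platform_for() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        arm64|aarch64) echo "linux/arm64" ;;
        *)             echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example use (assumes the tag provides a variant for the host platform):
#   docker run --rm -it --platform="$(docker_platform_for "$(uname -m)")" \
#       mohdsaid497566/mfc:v4.9.6-cpu
```

With a multi-platform manifest, Docker normally makes this choice on its own; passing `--platform` only makes the selection explicit.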
Well you could deploy an ARM compatible container. That's one option. But I've run docker with x86 code in it without issue on my Mac before (just had to add a flag so it knew what to do). I tried this with your container but it didn't work. Also I'm still confused about codespaces. Did you just enable it on your fork or something?
Yeah I "tried" that but Phoenix doesn't have docker or singularity/apptainer? I must be missing something here. |
So presumably this builds both versions, I suppose? We could try it out. I feel like this might be the equivalent of me using |
No, on the base and fork, I do not seem to be having any issues starting a codespace at all.
The CLIs do not exist on the login node. If you start an interactive job, you should be able to use apptainer and singularity directly there.
Whether multi-platform or arch-specific containers, I will experiment and compare whichever is better. |
Can you post a screenshot(s)? I don't see it anymore (I used to see it... which is why i suggested it)
Sounds good to me. NVHPC should support but you may have to be careful what image of theirs you pull down. |
Oh interesting. I don't see them when doing this on Phoenix nor Bridges2. What computer showed you this, did you have to load other modules first? |
Without loading any module, I just tried phoenix, bridges2, and delta with on-demand and ssh, and I was able to use them. |
Indeed... apptainer works from compute nodes. |
This is still a draft (not ready for review). It needs a user guide and a way for users to deploy in both use cases: 1. the programmer on a computer without internet and 2. the newcomer who knows very little. In each case I think we have some improvements to make to clarify what to do and how to do it. The newcomer may benefit from Codespaces, but I just tried it myself and couldn't figure out what to do from the VS Code codespace. That said, I was able to pull the Docker container onto my Mac when looking at your Docker Hub for arm64, and that did indeed work. |
BTW look at that old MFC logo! very cool |
PS let me know when this is ready for me to look at again |
Status Update: Pre-configured Repo Codespaces https://github.com/Malmahrouqi3/MFC-mo2/ The Docker GUI can be used to run any containerized release. Previously, a container would stop immediately after startup unless a running command was pre-written. Now it starts and stops when instructed by the user via the GUI buttons. It is pretty cool that you can retrieve any release and play with it within seconds under the Exec tab. |
very well done @Malmahrouqi3! This is really a blast from the past... I can't even remember the ordering of commands to get it to run with tip: we'll want to keep this requiring minimal upkeep, so it just keeps rolling out containers but otherwise doesn't need to be updated. right now it looks like you have a separate readme for the dockerhub, which closely mirrors the current MFC readme with some changes. this isn't really maintainable... MFC readmes change rapidly as I occasionally kill time messing with it. A minimal but everlasting readme is better (even if it's less attractive to look at).
|
I just checked the failing case that you were running which was I noticed that in your
and a line that says
You are failing the test because you aren't copying over important data files required to run. You commented out the specific input files, but you left in the blanket regex that matches all |
Thanks for catching this, and yeah that prolly being mistreated somehow. The Edit 1: As I said the exceptions were actually mishandled by the docker builder. The new images would pass PS: The Note to Self: Quick sanity check and update the PR description to be archived properly with all incurred/potential issues and how to handle them. Note to Spencer: Please remove the old GitHub Secrets from (#935) and add the new ones mentioned in |
Removed specific exceptions for lag_bubbles.dat files from .dockerignore.
Will do. I'm busy right now, but I'll update you soon. Please conduct a "dummy" check and have someone attempt this without any additional instructions. |
Great deal, yeah sure. |
devcontainer not working w/ mpi for me (2 ranks). 1 rank works fine.
| * QoS N/A * Binary N/A |
| * Queue System Interactive * Email N/A |
+-----------------------------------------------------------------------------------------------------------+
mfc: WARNING > This is the default template.
mfc: WARNING > It is not intended to support all systems and execution engines.
mfc: WARNING > Consider using a different template via the --computer option if you encounter problems.
mfc: OK > :) Selected MPI launcher mpirun. Use --binary to override.
mfc: OK > :) Running syscheck:
+ mpirun -np 2 /opt/MFC/build/install/9bc2b5c83c/bin/syscheck
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
/opt/MFC/build/install/9bc2b5c83c/bin/syscheck
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
mfc: ERROR > :( /opt/MFC/build/install/9bc2b5c83c/bin/syscheck failed with exit code 1.
Error: Submitting batch file for Interactive failed. It can be found here: /opt/MFC/examples/1D_sodshocktube/MFC.sh. Please check the file for errors.
Terminated
mfc: ERROR > main.py finished with a 143 exit code.
mfc: (venv) Exiting the Python virtual environment.
root@codespaces-a72cc1:/opt/MFC# |
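The Open MPI message above lists `--oversubscribe` as one way around the slot limit. A minimal sketch of that decision follows: it builds the `mpirun` arguments and adds `--oversubscribe` only when more ranks are requested than slots are available. The function name `mpirun_args` is hypothetical, the slot count is passed in by hand, and whether `mfc.sh` can forward such a flag to `mpirun` is not established in this thread.

```shell
# Hypothetical helper (not part of MFC or Open MPI): print mpirun arguments
# for a given rank count, oversubscribing only when slots are insufficient.
mpirun_args() {
    ranks=$1   # number of MPI ranks requested
    slots=$2   # slots Open MPI sees (by default, the number of processor cores)
    if [ "$ranks" -gt "$slots" ]; then
        echo "-np $ranks --oversubscribe"
    else
        echo "-np $ranks"
    fi
}

# Example: two ranks on a machine where Open MPI reports a single slot.
mpirun_args 2 1
```

Oversubscribed ranks share cores, so this is suitable for checking that a case runs, not for timing it.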
I made a Google Colab with the content of the Getting-Started and Docker pages.
# Install Linux and run all commands there.
# Note: use this cell to paste your code.
# Use Docker to run a 2D_shockbubble case on the v4.9.9 CPU release.
First, press the Windows key and search for the Docker app. Open the Docker app and, in its search bar, type "sbryngelson/mfc". In the results,
hover over the first option, then move to the tag menu. Scroll down until you find the requested version, "v4.9.9cpu", and click it to confirm your choice.
Click Run to start the container. Inspect it as it runs from the terminal in the "Execution" tab; you can use this terminal to run any command while the container is running.
Next, in that terminal, paste the following command: "./mfc.sh run examples/2D_shockbubble/case.py -n 2".
# Use Docker to run all tests on the v5.0.5 GPU release.
Copy the following command: "docker run -it --rm --gpus all --entrypoint bash sbryngelson/mfc:latest-gpu". Replace "latest" in the tag with the requested GPU version, which in this instance is "v5.0.5".
Ensure the "v" is lowercase, as the command is case sensitive. Go to the terminal in the Docker app, located in the lower right-hand corner of the screen.
Paste the command with "latest" replaced by "v5.0.5" and run it. Then copy the command "./mfc.sh test -j 8" into the Docker terminal and run it. |
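The tag substitution described above can be sketched as a small helper that assembles the image reference from a version and a variant. The function name `mfc_image` is hypothetical, and the `<version>-<variant>` tag pattern is an assumption inferred from tags quoted in this thread (`v4.9.6-cpu`, `latest-gpu`); the tag list on Docker Hub is the authority.

```shell
# Hypothetical helper: build an image reference of the assumed form
# sbryngelson/mfc:<version>-<variant>.
mfc_image() {
    version=$1   # e.g. v5.0.5 (lowercase "v") or latest
    variant=$2   # cpu or gpu
    echo "sbryngelson/mfc:${version}-${variant}"
}

# Example: the GPU test run described above would then look like
#   docker run -it --rm --gpus all --entrypoint bash "$(mfc_image v5.0.5 gpu)"
#   ./mfc.sh test -j 8        # typed inside the container
mfc_image v5.0.5 gpu
```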
User description
Description
Replaces #935, whose workflow was overly inefficient without yielding any tangible advantages.
Notes to Self:
Containerization of MFC Releases (v4.3.0-v5.0.5)
https://github.com/Malmahrouqi3/MFC-mo2.git --branch docker, and add credentials to its GitHub secrets.
release.yml), then hit Run workflow and specify a range of desired releases. Except for the first few published releases, where the --gpu flag was not yet defined, the rest should build without any issues. Repeated use of Release Dispatcher for debugging could exhaust Docker limits, e.g. "You have reached your pull rate limit as"
Setup MFC Container on Codespaces
This is intended to pre-configure repo codespace instances to load MFC docker container for users to interact with instantaneously.
.devcontainer folder in the repo root, and add a devcontainer.json file.
Quick Start Guide
GitHub Codespaces
Locally
User Guide
Make sure to mount a directory to mnt inside the container to easily share files between the host and the container, e.g. cp -r <source> /mnt/.
Docker CLI
Official Documentation
Use it for running and testing containers on your local machine.
Example Usage:
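A minimal sketch of such a run, assuming a host directory is bind-mounted at `/mnt` as recommended above. The helper name `mfc_run_cmd`, the host path, and the `latest-cpu` tag are illustrative assumptions; the command is printed instead of executed so it can be reviewed first.

```shell
# Hypothetical helper: print a docker command that starts a container with
# a host directory mounted at /mnt for sharing files.
mfc_run_cmd() {
    host_dir=$1   # directory on the host to share
    image=$2      # image reference, e.g. sbryngelson/mfc:latest-cpu (assumed tag)
    echo "docker run -it --rm -v $host_dir:/mnt $image"
}

# Example: share ~/mfc-share with the container.
mfc_run_cmd "$HOME/mfc-share" sbryngelson/mfc:latest-cpu
```

Inside the container, `cp -r <source> /mnt/` then places files where the host can see them.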
For Portability,
On the source machine,
On the target machine,
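One common way to move an image to a machine without internet access is `docker save` on the source machine and `docker load` on the target. The sketch below only prints the two commands; the helper name `docker_portability_cmds`, the archive name, and the `latest-cpu` tag are assumptions for illustration.

```shell
# Hypothetical helper: print the commands for moving a Docker image between
# machines through a tar archive.
docker_portability_cmds() {
    image=$1     # image reference to export
    archive=$2   # tar file to carry between machines
    echo "docker save -o $archive $image"   # run on the source machine
    echo "docker load -i $archive"          # run on the target machine
}

docker_portability_cmds sbryngelson/mfc:latest-cpu mfc.tar
```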
Apptainer CLI
Official Documentation
Use it for running containers on powerful machines and HPC clusters on either interactive terminals or batch jobs.
Example Usage:
or
--fakeroot --writable-tmpfs is critical to:
For Portability,
On the source machine,
.sif format
On the target machine,
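As a hedged sketch of the Apptainer route, the helper below prints a command that converts a Docker Hub image into a `.sif` file, followed by one that opens a shell in it with the `--fakeroot --writable-tmpfs` flags named above. The helper name `apptainer_cmds`, the file name `mfc.sif`, and the `latest-cpu` tag are illustrative assumptions.

```shell
# Hypothetical helper: print Apptainer commands for building a .sif image
# from Docker Hub and opening an interactive shell in it.
apptainer_cmds() {
    image=$1   # Docker Hub reference, e.g. sbryngelson/mfc:latest-cpu (assumed tag)
    sif=$2     # output file in .sif format
    echo "apptainer build $sif docker://$image"
    echo "apptainer shell --fakeroot --writable-tmpfs $sif"
}

apptainer_cmds sbryngelson/mfc:latest-cpu mfc.sif
```

As noted in the conversation, on some clusters the `apptainer` command is available only on compute nodes, so these would be run from an interactive job. The `.sif` file is a single file that can be copied to the target machine.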
PR Type
Enhancement
Description
Add Docker containerization support for MFC
Create CPU and GPU container variants
Implement automated container builds on releases
Configure Docker Hub deployment workflow
Diagram Walkthrough
File Walkthrough
.dockerignore
Docker ignore configuration for clean builds
.github/.dockerignore
Dockerfile
Multi-architecture Dockerfile for MFC containers
.github/Dockerfile
docker.yml
Automated container build and deployment workflow
.github/workflows/docker.yml
PR Type
Enhancement
Description
Add Docker containerization support for MFC releases
Create GitHub Actions workflow for automated container builds
Configure development container for Codespaces integration
Support both CPU and GPU container variants
Diagram Walkthrough
File Walkthrough
devcontainer.json
Add Codespaces development container configuration
.devcontainer/devcontainer.json
.dockerignore
Add Docker ignore file for build optimization
.github/.dockerignore
Dockerfile
Add Dockerfile for MFC containerization
.github/Dockerfile
docker.yml
Add GitHub Actions workflow for container builds
.github/workflows/docker.yml
Note
Adds Dockerfile and CI workflow to build/push CPU/GPU images and a Codespaces devcontainer configuration.
/.github/workflows/docker.yml: Matrix builds for cpu/gpu on x86/arm; sets up Buildx/QEMU; clones release tag; builds with compiler/base-image args; pushes to Docker Hub; creates/pushes multi-platform manifest lists (cpu, gpu, latest-*).
/.github/Dockerfile: Parametric base image and compilers; conditional deps for cpu vs gpu; builds MFC and runs dry-run tests; sets OMPI_*, PATH, LD_LIBRARY_PATH; stages repo in /opt/MFC and sets default workdir/entrypoint.
/.github/.dockerignore: Excludes build artifacts, caches, example outputs, and large media from build context.
/.devcontainer/devcontainer.json: Codespaces devcontainer using published sbryngelson/mfc:* image with workspace and editor settings.
Written by Cursor Bugbot for commit bea47ca.