First try running MITgcm

I first tried running the well-known numerical ocean model MITgcm about 12 years ago, when I was considering numerical simulations of shoaling nonlinear internal waves for my thesis work. As it turned out, I had more observational data than I could ever look at, so I shelved the modeling part of the project and moved on to pure data analysis.

Fast forward a decade or so, and I’m interested again. What follows is a crude log of my attempts over the course of an afternoon to get MITgcm up and running on my (quite new) M1 MacBook Pro, including pieces such as getting parallel processing (MPI) to work and writing output to netCDF files.

Setup notes: Installing MITgcm stuff on M1 mac

I am starting out by following instructions here:

https://jklymak.github.io/MITgcmExampleSteadyGauss/install.html

Already had Xcode and homebrew installed. Then:

$ brew install gcc
$ brew install open-mpi

Step 6 at the link doesn’t work, but it seems like the newer version should be:

$ brew install hdf5-mpi

Then:

$ brew install netcdf

which gave:

==> Installing netcdf dependency: hdf5
Error: Cannot install hdf5 because conflicting formulae are installed.
  hdf5-mpi: because hdf5-mpi is a variant of hdf5, one can only use one or the other

Please `brew unlink hdf5-mpi` before continuing.

Unlinking removes a formula's symlinks from /opt/homebrew. You can
link the formula again after the install finishes. You can --force this
install, but the build may fail or cause obscure side effects in the
resulting software.

so I did:

$ brew unlink hdf5-mpi

(Note, based on Jody’s comments: MITgcm itself doesn’t need hdf5-mpi. It is only required for a parallel-aware netCDF interface he is developing, and I don’t know the current status of that.) Now that there is a separate hdf5 package with MPI built in, I don’t know what the consequences are for the two “build from source” steps in Jody’s example.

Then I did:

$ brew reinstall --build-from-source netcdf

It’s not clear to me if the “build-from-source” is actually necessary here — it might be part of the steps Jody added for his netcdf interface work. A simple brew install netcdf might actually be fine …
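Before moving on to the build, a quick check that the tools are actually on the PATH can save some confusion later. The sketch below is only that: the list of tool names is my assumption about what the steps in this post rely on (gfortran from gcc, mpif77 and mpirun from open-mpi, nc-config from netcdf), and the function name check_tools is made up for illustration.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent tool; returns 1 if any are missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}

# Tool list is an assumption based on the brew installs above.
check_tools gfortran mpif77 mpirun nc-config \
    || echo "some tools are missing; revisit the brew steps above"
```

If nothing is printed, all four commands were found.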

Trying an example:

Clone the repo and go into the exp2 example (following https://mitgcm.readthedocs.io/en/latest/getting_started/getting_started.html):

$ git clone https://github.com/MITgcm/MITgcm.git
$ cd MITgcm/verification/exp2/build

Generate the Makefile:

$ ../../../tools/genmake2 -mods ../code/ -optfile ../../../tools/build_options/darwin_amd64_gfortran
GENMAKE :

A program for GENerating MAKEfiles for the MITgcm project.
   For a quick list of options, use "genmake2 -h"
or for more detail see the documentation, section "Building the model"
   (under "Getting Started") at:  https://mitgcm.readthedocs.io/

===  Processing options files and arguments  ===
  getting local config information:  none found
Warning: ROOTDIR was not specified ; try using a local copy of MITgcm found at "../../.."
  getting OPTFILE information:
    using OPTFILE="../../../tools/build_options/darwin_amd64_gfortran"
    get Compiler-version: '11'
  getting AD_OPTFILE information:
    using AD_OPTFILE="../../../tools/adjoint_options/adjoint_default"
  check Fortran Compiler...  pass  (set FC_CHECK=5/5)
  check makedepend (local: 0, system: 1, 1)

===  Checking system libraries  ===
  Do we have the system() command using gfortran...  yes
  Do we have the fdate() command using gfortran...  yes
  Do we have the etime() command using gfortran... c,r: yes (SbR)
  Can we call simple C routines (here, "cloc()") using gfortran...  yes
  Can we unlimit the stack size using gfortran...  yes
  Can we register a signal handler using gfortran...  no
  Can we use stat() through C calls...  yes
  Can we create NetCDF-enabled binaries...  yes
    skip check for LAPACK Libs
  Can we call FLUSH intrinsic subroutine...  yes

===  Setting defaults  ===
  Adding MODS directories: ../code/
  Making source files in eesupp from templates
  Making source files in pkg/exch2 from templates
  Making source files in pkg/regrid from templates

===  Determining package settings  ===
  getting package dependency info from  ../../../pkg/pkg_depend
  getting package groups info from      ../../../pkg/pkg_groups
  checking list of packages to compile:
    using PKG_LIST="../code//packages.conf"
    before group expansion packages are: gfd cd_code
    replacing "gfd" with:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor
    after group expansion packages are:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor cd_code
  applying DISABLE settings
  applying ENABLE settings
    packages are:  cd_code debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
  applying package dependency rules
    packages are:  cd_code debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
  Adding STANDARDDIRS='eesupp model'
  Searching for *OPTIONS.h files in order to warn about the presence
    of "#define "-type statements that are no longer allowed:
    found CPP_EEOPTIONS="../../../eesupp/inc/CPP_EEOPTIONS.h"
    found CPP_OPTIONS="../../../model/inc/CPP_OPTIONS.h"
  Creating the list of files for the adjoint compiler.

===  Creating the Makefile  ===
  setting INCLUDES
  Determining the list of source and include files
  Writing makefile: Makefile
  Add the source list for AD code generation
  Making list of "exceptions" that need ".p" files
  Making list of NOOPTFILES
  Add rules for links
  Adding makedepend marker

===  Done  ===
  original 'Makefile' generated successfully
=> next steps:
  > make depend
  > make       (<-- to generate executable)

Then we run:

$ make depend

which for me generates lots of warnings, e.g.

In file included from ini_model_io.F:21:
./EESUPPORT.h:11:20: warning: empty character constant [-Winvalid-pp-token]
C     | environment'' code. This data should be private to the   |
                   ^
1 warning generated.

but does produce:

Appending dependencies to Makefile
../../../tools/f90mkdepend >> Makefile
rm -f makedepend.out

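Looks like it worked! The executable still has to be built (the second of the “next steps” in the genmake2 output):

$ make

Now to test.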

$ cd ../run
$ ln -s ../input/* .
$ cp ../build/mitgcmuv .
$ ./mitgcmuv

Works!

(PID.TID 0000.0001) //      Avg. barrier spins =       1.00E+00
PROGRAM MAIN: Execution ended Normally
STOP NORMAL END

However, I didn’t get netCDF files … must be a setting somewhere to change that I missed. Will look later.

Let’s try a 2-core run. I need to recompile with MPI. I think that because there is a SIZE.h_mpi in code/ I don’t need to specify anything else, other than doing:

$ ../../../tools/genmake2 -mods ../code -mpi -optfile ../../../tools/build_options/darwin_amd64_gfortran

which produces:

GENMAKE :

A program for GENerating MAKEfiles for the MITgcm project.
   For a quick list of options, use "genmake2 -h"
or for more detail see the documentation, section "Building the model"
   (under "Getting Started") at:  https://mitgcm.readthedocs.io/

===  Processing options files and arguments  ===
  getting local config information:  none found
Warning: ROOTDIR was not specified ; try using a local copy of MITgcm found at "../../.."
  getting OPTFILE information:
    using OPTFILE="../../../tools/build_options/darwin_amd64_gfortran"
    get Compiler-version: '11'
  getting AD_OPTFILE information:
    using AD_OPTFILE="../../../tools/adjoint_options/adjoint_default"
  check Fortran Compiler...  pass  (set FC_CHECK=5/5)
  check makedepend (local: 0, system: 1, 1)
  Turning on MPI cpp macros

===  Checking system libraries  ===
  Do we have the system() command using mpif77...  yes
  Do we have the fdate() command using mpif77...  yes
  Do we have the etime() command using mpif77... c,r: yes (SbR)
  Can we call simple C routines (here, "cloc()") using mpif77...  yes
  Can we unlimit the stack size using mpif77...  yes
  Can we register a signal handler using mpif77...  no
  Can we use stat() through C calls...  yes
  Can we create NetCDF-enabled binaries...  yes
    skip check for LAPACK Libs
  Can we call FLUSH intrinsic subroutine...  yes

===  Setting defaults  ===
  Adding MODS directories: ../code
  Making source files in eesupp from templates
  Making source files in pkg/exch2 from templates
  Making source files in pkg/regrid from templates

===  Determining package settings  ===
  getting package dependency info from  ../../../pkg/pkg_depend
  getting package groups info from      ../../../pkg/pkg_groups
  checking list of packages to compile:
    using PKG_LIST="../code/packages.conf"
    before group expansion packages are: gfd cd_code
    replacing "gfd" with:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor
    after group expansion packages are:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor cd_code
  applying DISABLE settings
  applying ENABLE settings
    packages are:  cd_code debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
  applying package dependency rules
    packages are:  cd_code debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
  Adding STANDARDDIRS='eesupp model'
  Searching for *OPTIONS.h files in order to warn about the presence
    of "#define "-type statements that are no longer allowed:
    found CPP_OPTIONS="./CPP_OPTIONS.h"
    found CPP_EEOPTIONS="./CPP_EEOPTIONS.h"
  Creating the list of files for the adjoint compiler.

===  Creating the Makefile  ===
  setting INCLUDES
  Determining the list of source and include files
  Writing makefile: Makefile
  Add the source list for AD code generation
  Making list of "exceptions" that need ".p" files
  Making list of NOOPTFILES
  Add rules for links
  Adding makedepend marker

===  Done  ===

I don’t know whether I need the steps about “modules” and $MPI_INC_DIR described here:

https://mitgcm.readthedocs.io/en/latest/getting_started/getting_started.html#building-with-mpi

Let’s just see if it works with what I did above:

$ make clean
$ make depend

worked.

$ make 

Also seems to have worked.

Now we recopy the executable to the run/ directory, and run with mpi:

$ cd ../run
$ cp ../build/mitgcmuv .
$ mpirun -np 2 ./mitgcmuv

Which … didn’t work. The error log suggests it’s because I didn’t configure things properly:

$ less STDERR.0000
(PID.TID 0000.0001) *** ERROR *** EEBOOT_MINIMAL: No. of procs=     2 not equal to nPx*nPy=     1
(PID.TID 0000.0001) *** ERROR *** EEDIE: earlier error in multi-proc/thread setting
(PID.TID 0000.0001) *** ERROR *** PROGRAM MAIN: ends with fatal Error

So let’s try again. Probably I just need to copy the code/SIZE.h_mpi over to SIZE.h. Then redo the genmake2, make depend, and make steps (after completely nuking everything in the build/ directory).
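The error above is the model comparing the number of processes started by mpirun with the nPx*nPy that was compiled in from SIZE.h. A small sketch for reading those two values out of a SIZE.h-style file and working out the matching -np is below. The function name size_param is hypothetical, the pattern assumes the `&  name = value,` continuation-line layout shown in this post, and the demo uses a scratch file rather than the real code/SIZE.h.

```shell
#!/bin/sh
# Extract an integer parameter (e.g. nPx) from a SIZE.h-style file.
# $1 = parameter name, $2 = file. Assumes lines like "     &   nPx =   2,".
size_param() {
    sed -n "s/^ *& *$1 *= *\([0-9][0-9]*\).*/\1/p" "$2" | head -n 1
}

# Demo on a scratch file holding the MPI settings quoted later in this post.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
     &           nSx =   1,
     &           nSy =   2,
     &           nPx =   2,
     &           nPy =   1,
EOF

# The process count passed to mpirun has to equal nPx*nPy.
nprocs=$(( $(size_param nPx "$tmp") * $(size_param nPy "$tmp") ))
echo "mpirun -np $nprocs ./mitgcmuv"
rm -f "$tmp"
```

Pointing size_param at the SIZE.h that was actually compiled (rather than SIZE.h_mpi) would have shown nPx*nPy = 1 before the failed run.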

Then:

$ mpirun -np 2 ./mitgcmuv
STOP NORMAL END
STOP NORMAL END

Yay!

Now to figure out how to do netCDF and then look at data …

NetCDF output

Ok, looks like I need to add mnc to the packages.conf file in code/. Then nuke the build directory and rebuild as above.

Ok, that didn’t work. I still get only *.data/*.meta files …

According to here:

https://mitgcm.readthedocs.io/en/latest/outp_pkgs/outp_pkgs.html#using-pkg-mnc

I need to run a genmake2 that looks like:

$ ../../../tools/genmake2 -mods ../code -mpi -enable=mnc -optfile ../../../tools/build_options/darwin_amd64_gfortran

Then the classic make depend, make, and move mitgcmuv to the run/ directory, and … I still don’t get any netCDF files.

Maybe it’s because I need to set it in the input/data.pkg file? I modified it to look like:

# Packages
 &PACKAGES
  useMNC=.TRUE.,
 &

Then run again. There should be no need to rebuild, because useMNC is a runtime option and I already built with MNC enabled.

This now fails with:

STOP ABNORMAL END: S/R OPEN_COPY_DATA_FILE
STOP ABNORMAL END: S/R OPEN_COPY_DATA_FILE

Let’s just switch back to single processor, and build without mpi to see if netCDF will work.

This doesn’t work either, but at least I can see in the output of ./mitgcmuv that it’s because it can’t find a data.mnc file:

(PID.TID 0000.0001)  MNC_READPARMS: opening file 'data.mnc'
File data.mnc does not exist!

I’m just going to try copying the one from Jody’s “SteadyGauss” example, which contains:

# =====================================================================
# | Parameters for MNC (NetCDF)      |
# =====================================================================
# Example "data.mnc" file
# Lines beginning "#" are comments
 &MNC_01
/
# Note: Some systems use & as the
# namelist terminator. Other systems
# use a / character (as shown here).

Remember to make sure everything in input/ is linked with ln -s ../input/* ., and then run.
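Looking back, the two run-time failures came from two missing pieces: useMNC=.TRUE. in data.pkg, and a data.mnc file in the run directory. A rough pre-flight check for both is sketched below. The function name mnc_preflight is made up, the demo runs on a scratch directory standing in for run/, and it only checks these two things, not whether the executable was built with mnc.

```shell
#!/bin/sh
# Check a run directory for the two run-time pieces MNC needed in this log:
#   1. data.pkg sets useMNC=.TRUE. (on a line that is not a # comment)
#   2. a data.mnc file exists (an empty &MNC_01 namelist was enough here)
# Prints one line per problem; returns 1 if anything is missing.
mnc_preflight() {
    dir=$1
    status=0
    if ! grep -qi '^[^#]*useMNC *= *\.TRUE\.' "$dir/data.pkg" 2>/dev/null; then
        echo "data.pkg: useMNC=.TRUE. not found"
        status=1
    fi
    if [ ! -e "$dir/data.mnc" ]; then
        echo "data.mnc: file missing"
        status=1
    fi
    return $status
}

# Demo on a scratch directory standing in for run/ (hypothetical contents).
demo=$(mktemp -d)
printf ' &PACKAGES\n  useMNC=.TRUE.,\n &\n' > "$demo/data.pkg"
mnc_preflight "$demo" || echo "-> fix the above before running"
printf ' &MNC_01\n /\n' > "$demo/data.mnc"
mnc_preflight "$demo" && echo "MNC run-time files look complete"
rm -rf "$demo"
```

In the real run/ directory the call would be `mnc_preflight .` after linking the input files.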

It worked! I see nc files (and data/meta files):

$ ls
PHrefC.data                      mitgcmuv*                        pickup.ckptA.002.002.meta
PHrefC.meta                      monitor.0000000000.t001.nc       pickup_cd.ckptA.001.001.data
PHrefF.data                      monitor_grid.0000000000.t001.nc  pickup_cd.ckptA.001.001.meta
PHrefF.meta                      phiHyd.0000000000.t001.nc        pickup_cd.ckptA.001.002.data
RhoRef.data                      phiHyd.0000000000.t002.nc        pickup_cd.ckptA.001.002.meta
RhoRef.meta                      phiHyd.0000000000.t003.nc        pickup_cd.ckptA.002.001.data
SSS.bin@                         phiHyd.0000000000.t004.nc        pickup_cd.ckptA.002.001.meta
SST.bin@                         phiHydLow.0000000000.t001.nc     pickup_cd.ckptA.002.002.data
STDERR.0000                      phiHydLow.0000000000.t002.nc     pickup_cd.ckptA.002.002.meta
data@                            phiHydLow.0000000000.t003.nc     salt.bin@
data.mnc@                        phiHydLow.0000000000.t004.nc     state.0000000000.t001.nc
data.pkg@                        pickup.ckptA.001.001.data        state.0000000000.t002.nc
eedata@                          pickup.ckptA.001.001.meta        state.0000000000.t003.nc
eedata.mth@                      pickup.ckptA.001.002.data        state.0000000000.t004.nc
grid.t001.nc                     pickup.ckptA.001.002.meta        theta.bin@
grid.t002.nc                     pickup.ckptA.002.001.data        topog.bin@
grid.t003.nc                     pickup.ckptA.002.001.meta        windx.bin@
grid.t004.nc                     pickup.ckptA.002.002.data        windy.bin@

Apparently the presence of the PHrefC.* files is a bug — Ruth sees those even in all the runs that she does.

The state* files are the netCDF files that contain the model variables. Note that there are 4 of them: even though I wasn’t using MPI, the SIZE.h for this example (exp2) has nSx=2 and nSy=2 (the number of tiles per process in x and y), and MNC writes one file per tile. As far as I understand, that by itself does not mean the run was multithreaded, since the thread counts are set separately (nTx and nTy in eedata). To look at the full model output you have to piece the 4 files together. There are Python/MATLAB scripts to do this (somewhere), but probably no one has done it for R. I’m not sure if there’s good general documentation on how to read these in.
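The tNNN part of each file name is the tile number, and as far as I understand the total number of tiles is nSx*nSy*nPx*nPy. The sketch below just does that bookkeeping for the state files in the listing above. It does not read or stitch any netCDF data, and the helper names count_tiles and tile_of are made up for illustration.

```shell
#!/bin/sh
# Expected number of tiles: tiles per process times number of processes.
count_tiles() {   # args: nSx nSy nPx nPy
    echo $(( $1 * $2 * $3 * $4 ))
}

# Pull the tile number out of an MNC-style file name (e.g. ...t003.nc -> 003).
tile_of() {
    printf '%s\n' "$1" | sed -n 's/.*\.t\([0-9][0-9]*\)\.nc$/\1/p'
}

# File names copied from the ls listing above.
files="state.0000000000.t001.nc state.0000000000.t002.nc
state.0000000000.t003.nc state.0000000000.t004.nc"

found=0
for f in $files; do
    echo "tile $(tile_of "$f"): $f"
    found=$(( found + 1 ))
done

expected=$(count_tiles 2 2 1 1)   # nSx=2, nSy=2, nPx=1, nPy=1 (non-MPI build)
echo "found $found state files, expected $expected"
```

A mismatch between the two numbers would be a hint that some tile files are left over from an earlier run with a different SIZE.h.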

To really control output I should be using the “diagnostics” package. Ruth will hopefully go over this with the group.

Note that I tried changing SIZE.h to have:

     &           nSx =   1,
     &           nSy =   1,

and when I recompile and try to run I get:

(PID.TID 0000.0001)  INI_PARMS ; starts to read PARM04
At line 4910 of file ini_parms.for (unit = 11, file = 'scratch1.000000000')
Fortran runtime error: Repeat count too large for namelist object dely

Error termination. Backtrace:
#0  0x1073d5147
#1  0x1073d5dbf
#2  0x1073d6743
#3  0x10749a387
#4  0x1074a2683
#5  0x1074a2937
#6  0x1048966d7
#7  0x1048a2a93
#8  0x1048bbdf3
#9  0x10483e9c3
#10  0x1048c3597

So … not sure what that’s about. Probably some other config parameter that I got wrong.
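One possible explanation, which I have not verified against this run: as I understand SIZE.h, the global grid size is derived from the tile settings (Ny = sNy*nSy*nPy, and likewise for Nx), so dropping nSy from 2 to 1 without changing sNy halves Ny. A grid-spacing entry in the data file written with a repeat count for the old Ny would then be too long for the array. The numbers below are hypothetical, chosen only to illustrate the arithmetic, and the helper names are made up.

```shell
#!/bin/sh
# Global grid size along one axis, as I understand SIZE.h:
# points per tile * tiles per process * processes.
grid_points() {   # args: sN nS nP
    echo $(( $1 * $2 * $3 ))
}

# Repeat count of a namelist value written as N*value (e.g. 40*4. -> 40).
repeat_count() {
    printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\)\*.*/\1/p'
}

sNy=20            # hypothetical points per tile in y
entry='40*4.'     # hypothetical delY entry in the data file

for nSy in 2 1; do
    ny=$(grid_points "$sNy" "$nSy" 1)
    if [ "$(repeat_count "$entry")" -gt "$ny" ]; then
        echo "nSy=$nSy: Ny=$ny, repeat count $(repeat_count "$entry") is too large"
    else
        echo "nSy=$nSy: Ny=$ny, repeat count $(repeat_count "$entry") fits"
    fi
done
```

If this is the cause, the fix would be to scale sNx and sNy up when scaling nSx and nSy down, so that the products stay the same.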

So now I go back to seeing whether MPI and netCDF work together. I revert to the SIZE.h file configured for MPI, which has:

     &           nSx =   1,
     &           nSy =   2,
     &           nPx =   2,
     &           nPy =   1,

so it should need two processes to run. I rebuild everything, then go into the run/ directory and do:

$ mpirun -np 2 ./mitgcmuv

which works! Runs on 2 processors and makes netcdf files (which I don’t really know how to read properly yet …).