Principle¶
The idea of DIRACOS is to rebuild everything we need from the system and ship it all together as one big tar file. The structure looks like a complete root filesystem. The steps are:
- Create an empty repository
- Compile our own python and make an RPM out of it in this repository
- Rebuild all RPMs we need using this python
- Extract all the RPMs and the dependencies
- Install the python dependencies with pip
- Bundle everything as a tarball
How it works¶
This process relies mostly on Fedora mock
(https://github.com/rpm-software-management/mock/wiki) to build RPMs and
to create isolated environments. Python packages are installed using pip.
Bootstrap issue¶
There is a bootstrap issue with mock: mock needs gdb which, when compiled with python support, needs python. The point is that we want to use python 2.7 (at the moment), which is different from the system python (2.6) used to compile gdb. To solve this, we need to:
- Compile python and put it in our repository
- Recompile gdb without python support, or with our python
- Use this gdb instead of the system one.
About EPEL repository¶
We could very well rely on the EPEL repository to provide most of the dependencies. The issue is that it is very difficult to know whether a given EPEL RPM depends on something we rebuild ourselves, in which case that RPM needs to be recompiled as well. To avoid such headaches, we simply exclude EPEL altogether and recompile everything.
About links¶
Some RPMs or pip packages make use of links (symbolic or hard). Every symlink pointing outside of DIRACOS itself is replaced by a copy of its target; the others are left untouched. Hard links are always replaced by a copy, because some file systems do not support them (in particular CVMFS, which will probably be the main means of distribution).
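The link policy described above can be sketched in a few lines of Python. This is only an illustration, not the code DIRACOS actually runs: the function name flatten_links, the helper _is_inside and the temporary file suffix are hypothetical, and Python 3 is assumed.

```python
import os
import shutil


def _is_inside(path, root):
    """Return True if path is root itself or lies somewhere under root."""
    return os.path.commonpath([path, root]) == root


def flatten_links(root):
    """Replace outward-pointing symlinks and all hard links under root by copies."""
    root = os.path.realpath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                if _is_inside(target, root) or not os.path.exists(target):
                    continue  # internal (or dangling) symlink: left untouched
                os.remove(path)
                if os.path.isdir(target):
                    shutil.copytree(target, path, symlinks=True)
                else:
                    shutil.copy2(target, path)
            elif os.path.isfile(path) and os.stat(path).st_nlink > 1:
                # Hard link: copy the content to a fresh inode, then swap it in.
                tmp = path + ".diracos_tmp"  # hypothetical temporary name
                shutil.copy2(path, tmp)
                os.replace(tmp, path)
```

Symlinks that stay inside the tree are kept as they are, which preserves relocatability as long as they are relative.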
About dependencies¶
Once the list of RPMs and python packages is built, we pull all their dependencies using the yum resolution mechanism. However, we need a stopping criterion. After a lot of testing, we decided that the best criterion is to stop whenever we see glibc in the dependencies. Of course, there are other packages from which we could pull deeper, but this did not give good results (too many useless things shipped), so we decided to add the manualDependencies functionality for such cases.
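To make the stopping criterion concrete, here is a minimal sketch of a dependency walk that does not descend past glibc. It works on a toy dictionary rather than on yum, and it reads "stop at glibc" as "glibc and everything below it is not pulled", which is an assumption about the exact rule. The names resolve_dependencies, requires, stop_at and manual_dependencies are hypothetical; only the idea of a manual dependency list comes from the text above.

```python
from collections import deque


def resolve_dependencies(packages, requires, stop_at=("glibc",),
                         manual_dependencies=()):
    """Breadth-first walk of a dependency graph that never descends past stop_at.

    packages: names we explicitly want to ship
    requires: dict mapping a package name to the names it depends on
              (a stand-in for what the yum resolver would report)
    manual_dependencies: extra packages added by hand
    """
    selected = set()
    queue = deque(list(packages) + list(manual_dependencies))
    while queue:
        pkg = queue.popleft()
        if pkg in selected or pkg in stop_at:
            continue  # already handled, or we hit the stop package
        selected.add(pkg)
        queue.extend(requires.get(pkg, ()))
    return selected


# Toy dependency graph, for illustration only.
toy_requires = {
    "a": ["b", "glibc"],
    "b": ["c"],
    "c": [],
    "glibc": ["basesystem"],
}
print(sorted(resolve_dependencies(["a"], toy_requires)))
```

A package that is only reachable through glibc (basesystem in the toy graph) is not shipped unless it is listed by hand, which is the role the manual list plays in this sketch.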
Dynamic libraries¶
When dynamic libraries are resolved by /lib64/ld-linux-x86-64.so.2, libraries are searched for in the following order:
- The RPATH dynamic section of the ELF file (ignored if a RUNPATH section is present)
- The LD_LIBRARY_PATH environment variable
- The RUNPATH dynamic section of the ELF file
- The default locations specified in the interpreter itself
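The search order above can be modelled with a small function. This is a simplified sketch: it leaves out the ld.so cache, the default directories are an assumption for x86-64, and the function and parameter names are hypothetical.

```python
def library_search_path(rpath=(), ld_library_path=(), runpath=(),
                        defaults=("/lib64", "/usr/lib64")):
    """Return the directories the dynamic loader would try, in order.

    Simplified model: the ld.so cache is ignored, and the default
    directories are an assumption for a 64-bit system.
    """
    ordered = []
    if not runpath:
        # The loader ignores RPATH when the ELF file also carries RUNPATH.
        ordered.extend(rpath)
    ordered.extend(ld_library_path)
    ordered.extend(runpath)
    ordered.extend(defaults)
    return ordered


print(library_search_path(rpath=["$ORIGIN/../lib64"],
                          ld_library_path=["/opt/payload/lib"]))
```

In this model an RPATH entry is consulted before LD_LIBRARY_PATH, so a binary carrying its own RPATH does not depend on the environment to find its libraries.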
Ordinarily, the CentOS 6 SRPMs upon which DIRACOS is based find shared libraries by relying on the default locations (item 4 above). As DIRACOS has to be relocatable, this has to be overridden. Prior to DIRACOS v1r10 this was achieved using the LD_LIBRARY_PATH environment variable; however, this interferes with the job payload in undesirable ways (see DIRACGrid/DIRAC#4480).
DIRACOS v1r11 uses patchelf to run a post-processing step on the built binaries, modifying the headers of all dynamically linked ELF files to add the RPATH section. This is set to a path starting with $ORIGIN/ so that dependencies are found relative to the current file's location rather than through an absolute path. This transformation is performed by diracos/scriptTemplates/set_RPATH.py, which is called by diracos/scriptTemplates/bundle_diracos_script_tpl.sh.
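As an illustration of the $ORIGIN approach, the sketch below computes a relative RPATH for one file and hands it to patchelf. It is not the content of set_RPATH.py: the function names and the example paths are hypothetical, and the choice of library directories is an assumption. The patchelf options --set-rpath and --force-rpath are real; the latter asks patchelf to write RPATH rather than RUNPATH.

```python
import os
import subprocess


def origin_rpath(elf_path, lib_dirs):
    """Build an RPATH string whose entries are relative to the ELF file."""
    elf_dir = os.path.dirname(os.path.abspath(elf_path))
    entries = []
    for lib_dir in lib_dirs:
        rel = os.path.relpath(os.path.abspath(lib_dir), elf_dir)
        entries.append("$ORIGIN" if rel == "." else "$ORIGIN/" + rel)
    return ":".join(entries)


def set_rpath(elf_path, lib_dirs):
    """Write the relative RPATH into elf_path (requires patchelf on PATH)."""
    rpath = origin_rpath(elf_path, lib_dirs)
    subprocess.check_call(
        ["patchelf", "--force-rpath", "--set-rpath", rpath, elf_path]
    )


# Hypothetical layout, for illustration only.
print(origin_rpath("/opt/diracos/usr/bin/python",
                   ["/opt/diracos/usr/lib64", "/opt/diracos/lib64"]))
```

Because every entry starts with $ORIGIN, the whole tree can be moved to another prefix without patching the files again.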
Supported platforms¶
Only SLC6 and CC7 are supported. We have automated tests for CentOS 8, LTS Ubuntu and Fedora, but if they fail, too bad! It is up to you to fix them if you wish.
Trick¶
If the tests are green, you can check gitlab-ci.yml to see which hooks we use to make the tests work on Fedora and Ubuntu. But again, use them at your own risk, and do not ask for support.