Janis Lesinskis' Blog

Assorted ramblings

Dependency management in code files is not great


Lately I've been helping a friend get an old Ruby on Rails app (more than a decade since the last commit) updated to a more recent version. This reminds me of a project from late 2019, where I helped a friend debug another old Rails app, one used by a venture capital firm, as part of a project to modernize it and fix some bugs. My strategy in these situations is something like this:

  1. Get the current version working in a docker container with an old version of Ruby
  2. Update the version of Ruby
  3. Update the version of Rails by one major version
  4. Fix dependency issues
  5. Repeat steps 3/4 until the versions are up to date

There have been a LOT of problems with this, especially around dependency management. Something that I've seen while messing around with the Gemfiles is the following:

group :development, :test do
  gem 'debase'
  gem 'byebug'
end

This looks OKish if you are on the "happy path": you are developing on a local machine, you aren't using containers, and you are happy to install different code in development than in production. But it turns out that this approach creates a rather large amount of friction, because it makes it far harder for people to write dependency management tools. You have to execute the script to reliably know what the dependencies are, which makes it far harder to create static analysis that determines what's actually going to be installed for any given package. As a result the tooling in these situations tends to be dramatically worse than it could otherwise be.

The other thing is that if you then want to install with Bundler into a Docker image, you'll end up with a Gemfile.lock which tracks what packages were installed. If Gemfile.lock is copied from the source directory, you'll now have mismatches unless you mirror the logic in the Gemfile elsewhere in the install process. There's often a strong temptation to start mirroring the conditional logic in other places in the build pipeline, and these copies can get out of sync.
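To make the "you have to execute it to know" point concrete, here is a minimal sketch. `GemfileRecorder` is a hypothetical stand-in that I made up for illustration (it is not Bundler's actual DSL class), and the `USE_DEBUGGER` environment variable and the example Gemfile contents are also invented. It evaluates the same Gemfile text twice and gets two different dependency lists:

```ruby
# Hypothetical stand-in for a Gemfile DSL, for illustration only (not Bundler).
# It records which gems a Gemfile asks for when the file is evaluated.
class GemfileRecorder
  attr_reader :gems

  def initialize
    @gems = []
    @current_groups = [:default]
  end

  # Called for every `gem '...'` line in the evaluated Gemfile.
  def gem(name, *_requirements)
    @gems << { name: name, groups: @current_groups.dup }
  end

  # Called for `group :a, :b do ... end`; gems inside get those groups.
  def group(*names)
    previous = @current_groups
    @current_groups = names
    yield
  ensure
    @current_groups = previous
  end
end

# A made-up Gemfile. The trailing conditional means the dependency list
# depends on the environment at the moment the file is evaluated.
EXAMPLE_GEMFILE = <<~'RUBY'
  gem 'rails'
  group :development, :test do
    gem 'byebug'
  end
  gem 'debase' if ENV['USE_DEBUGGER'] == '1'
RUBY

# Evaluates the Gemfile source with temporary ENV overrides and returns
# the names of the gems it requested.
def gem_names(gemfile_source, env_overrides)
  saved = ENV.to_h
  ENV.update(env_overrides)
  recorder = GemfileRecorder.new
  recorder.instance_eval(gemfile_source)
  recorder.gems.map { |g| g[:name] }
ensure
  ENV.replace(saved)
end

puts gem_names(EXAMPLE_GEMFILE, 'USE_DEBUGGER' => '0').inspect
puts gem_names(EXAMPLE_GEMFILE, 'USE_DEBUGGER' => '1').inspect
```

The two calls return different lists from identical text, so a tool that only reads the file without running it can't know which one applies.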

Another issue is that I've had to rerun the Docker image build over 50 times now, since there are many issues that only show up after you've attempted to execute the install process and run the Rails app. This slow feedback loop is especially annoying because multiple errors could have been found in one run if the dependency files were not executable. However, because the Gemfile is Ruby code, it has to be executed line by line. If static analysis tooling were better, I might not have had to execute this at all, because the version mismatches could have been found by the tooling.
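As a rough sketch of what "finding multiple errors in one run" could look like when the dependency information is plain data, the snippet below checks every constraint in a single pass. The gem names, constraints and pinned versions are hypothetical values made up for the example, not taken from the app I was working on. It relies only on `Gem::Requirement` and `Gem::Version` from RubyGems:

```ruby
require 'rubygems' # Gem::Version and Gem::Requirement ship with Ruby

# Hypothetical data: the constraint placed on each gem...
REQUIREMENTS = {
  'rails'    => '~> 5.2',
  'nokogiri' => '>= 1.10',
  'pg'       => '< 1.0'
}.freeze

# ...and the version that is currently pinned for it.
PINNED = {
  'rails'    => '4.2.11',
  'nokogiri' => '1.10.10',
  'pg'       => '1.2.3'
}.freeze

# Returns every mismatch at once instead of stopping at the first failure.
def version_mismatches(requirements, pinned)
  requirements.each_with_object([]) do |(name, constraint), problems|
    version = pinned[name]
    if version.nil?
      problems << "#{name}: no pinned version"
    elsif !Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
      problems << "#{name}: #{version} does not satisfy #{constraint}"
    end
  end
end

puts version_mismatches(REQUIREMENTS, PINNED)
```

With this made-up data both the `rails` and `pg` mismatches are reported together, rather than needing one image rebuild to discover each.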

This reminds me of a very similar problem that you get with Python's setup.py files: you just don't know what is actually going to happen with these files until you execute them. This is a major headache for any tooling that actually wants to do a scan of what the dependencies of packages are, since you might not want to just go execute arbitrary code you have fetched from the internet in order to find out what other packages need to be installed. This is why the Python community created a file format for specifying dependencies in PEP 508 and PEP 631.

I think the main lesson that's been learned by many in the last decade is that it's far better to specify dependencies in some structured format rather than in executable code files. This makes it much easier for other tooling to analyse the dependencies.
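For contrast, here is a small sketch of the structured approach. The JSON manifest layout below is something I invented purely for this example (it is not a real Bundler or RubyGems format), and `gems_for` is a hypothetical helper. The point is only that a tool can answer "what gets installed for these groups?" by reading data, without running anything from the manifest:

```ruby
require 'json'

# Hypothetical manifest format, invented for this example.
MANIFEST_JSON = <<~JSON
  {
    "dependencies": [
      {"name": "rails",  "requirement": "~> 5.2", "groups": ["default"]},
      {"name": "byebug", "requirement": ">= 0",   "groups": ["development", "test"]},
      {"name": "debase", "requirement": ">= 0",   "groups": ["development", "test"]}
    ]
  }
JSON

# Lists the gems needed for the given groups by reading data only;
# nothing in the manifest is executed.
def gems_for(manifest_json, groups)
  JSON.parse(manifest_json).fetch('dependencies')
      .select { |dep| (dep.fetch('groups') & groups).any? }
      .map { |dep| dep.fetch('name') }
end

puts gems_for(MANIFEST_JSON, ['default']).inspect
puts gems_for(MANIFEST_JSON, %w[default development]).inspect
```

Because the manifest is just data, the same answer comes back no matter which machine, container or environment variables the tool runs with.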

Published: Sun 03 January 2021
By Janis Lesinskis
In Software-engineering
Tags: python ruby rails gemfiles configuration-management dependency-management docker interpreted-languages scripting package-management
