[personal profile] escapewindow

A few people have suggested I look at other packages for config solutions. I thought I'd record some of my thoughts on the matter. Let's look at requirements first.

Requirements

  1. Commandline argument support. When running scripts, it's much faster to specify some config via the commandline than always requiring a new config file for each config change.

  2. Default config value support. If a script assumes a value works for most cases, let's make it default, and allow for overriding those values in some way.

  3. Config file support. We need to be able to read in config from a file, and in some cases, several files. Some config values are too long and unwieldy to pass via the commandline, and some contain characters that would be interpreted by the shell. Plus, the ability to use diff and version control on these files is invaluable.

  4. Multiple config file type support. json, yaml, etc.

  5. Adding the above three solutions together. The order should be: default config value -> config file -> commandline arguments. (The rightmost value of a configuration item wins.)

  6. Config definition and validation. Commandline options are constrained by the options that are defined, but config files can contain any number of arbitrary key/value pairs.

  7. The ability to add groups of commandline arguments together. Sometimes families of scripts need a common set of commandline options, but also need the ability to add script-specific options. Sharing the common set allows for consistency.

  8. The ability to add config definitions together. Sometimes families of scripts need a common set of config items, but also need the ability to add script-specific config items.

  9. Locking and/or logging any changes to the config. Changing config during runtime can wreak havoc on the debugability of a script; locking or logging the config helps avoid or mitigate this.

  10. Python 3 support, and python 2.7 unicode support, preferably unicode-by-default.

  11. Standardized solution, preferably non-company and non-language specific.

  12. All-in-one solution, rather than having to use multiple solutions.
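
To make requirement 5 concrete, here is a minimal sketch of the default -> config file -> commandline ordering, using only the standard library. The option names, the `DEFAULTS` dict, and the `build_config` helper are made up for illustration; this is not how any of the packages below implement it.

```python
import argparse
import json

# Hypothetical defaults for an imaginary script.
DEFAULTS = {"log_level": "info", "workers": 2, "output_dir": "out"}


def build_config(defaults, config_file_text=None, argv=None):
    """Layer config sources: defaults -> config file -> commandline."""
    parser = argparse.ArgumentParser()
    # default=argparse.SUPPRESS keeps unset options out of the Namespace,
    # so only options actually passed on the commandline override earlier layers.
    parser.add_argument("--log-level", default=argparse.SUPPRESS)
    parser.add_argument("--workers", type=int, default=argparse.SUPPRESS)
    parser.add_argument("--output-dir", default=argparse.SUPPRESS)

    config = dict(defaults)                           # layer 1: defaults
    if config_file_text is not None:
        config.update(json.loads(config_file_text))   # layer 2: config file
    config.update(vars(parser.parse_args(argv)))      # layer 3: commandline
    return config


config = build_config(
    DEFAULTS,
    config_file_text='{"workers": 4, "log_level": "debug"}',
    argv=["--log-level", "warning"],
)
```

Here the config file overrides the default `workers`, and the commandline overrides the file's `log_level`, so the rightmost source wins for each item.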

Packages and standards

argparse

Argparse is the standardized python commandline argument parser, which is why configman and scriptharness have wrapped it to add further functionality. Its main drawbacks are lack of config file support and limited validation.

  1. Commandline argument support: yes. That's what it's written for.

  2. Default config value support: yes, for commandline options.

  3. Config file support: no.

  4. Multiple config file type support: no.

  5. Adding the above three solutions together: no. The default config value and the commandline arguments are placed in the same Namespace, and you have to use the parser.get_default() method to determine whether it's a default value or an explicitly set commandline option.

  6. Config definition and validation: limited. It only covers commandline option definition and validation. There's the required flag, but no "if foo is set, bar is required" type of validation. It's possible to roll your own, but that would be script-specific rather than part of the standard.

  7. Adding groups of commandline arguments together: yes. You can take multiple parsers and make them parent parsers of a child parser, if the parent parsers have specified add_help=False.

  8. Adding config definitions together: limited, as above.

  9. The ability to lock/log changes to the config: no. argparse.Namespace will take changes silently.

  10. Python 3 + python 2.7 unicode support: yes.

  11. Standardized solution: yes, for python. No for other languages.

  12. All-in-one solution: no, for the above limitations.
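
Items 5, 7, and 9 above can be sketched in a few lines of plain argparse. The option names and the `is_default` helper are hypothetical; `parents=`, `add_help=False`, and `get_default()` are standard argparse.

```python
import argparse

# Common options shared by a family of scripts. add_help=False avoids a
# conflicting second -h/--help when the child parser adds its own.
common = argparse.ArgumentParser(add_help=False)
common.add_argument("--log-level", default="info")

# Script-specific options layered on top of the common set.
parser = argparse.ArgumentParser(parents=[common])
parser.add_argument("--workers", type=int, default=2)

args = parser.parse_args(["--workers", "8"])


def is_default(parser, namespace, dest):
    """Return True if the parsed value equals the parser's default for dest.

    Caveat: explicitly passing the default value on the commandline is
    indistinguishable from omitting the option.
    """
    return getattr(namespace, dest) == parser.get_default(dest)


log_level_is_default = is_default(parser, args, "log_level")
workers_is_default = is_default(parser, args, "workers")

# Namespace accepts later changes without any error or log message.
args.workers = 99
```

The comparison against `get_default()` is only a heuristic, which is part of why layering config sources on top of a bare Namespace is awkward.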

configman

Configman is a tool written to deal with configuration in various forms, and adds the ability to transform configs from one type to another (e.g., commandline to ini file). It also adds the ability to block certain keys from being saved or output. Its argparse implementation is deeper than scriptharness' ConfigTemplate argparse abstraction.

Its main drawbacks for scriptharness usage appear to be the lack of python 3 + py2-unicode-by-default support, and that it's another non-standardized solution. I've given python3 porting two serious attempts so far, and I've hit a wall on the dotdict __getattr__ hack working differently on python 3. My wip is here if someone else wants a stab at it.

  1. Commandline argument support: yes.

  2. Default config value support: yes.

  3. Config file support: yes.

  4. Multiple config file type support: yes.

  5. Adding the above three solutions together: not as far as I can tell, but since you're left with the ArgumentParser object, I imagine it'll be the same solution to wrap configman as argparse.

  6. Config definition and validation: yes.

  7. Adding groups of commandline arguments together: yes.

  8. Adding config definitions together: not sure, but seems plausible.

  9. The ability to lock/log changes to the config: no. configman.namespace.Namespace will take changes silently.

  10. Python 3 support: no. Python 2.7 unicode support: there are enough str() calls that it looks like unicode is a second class citizen at best.

  11. Standardized solution: no.

  12. All-in-one solution: no, for the above limitations.

docopt

Docopt simplifies the commandline argument definition and prettifies the help output. However, it's purely a commandline solution, and doesn't support adding groups of commandline options together, so it appears to be oriented towards relatively simple script configuration. It could potentially be combined with json-schema definition and validation, as could the argparse-based commandline solutions, for an all-in-two solution. More on that below.

json-schema

This looks very promising for an overall config definition + validation schema. The main drawback, as far as I can see so far, is the lack of commandline argument support.

A commandline parser could generate a config object to validate against the schema. (Bonus points for writing a function to validate a parser against the schema before runtime.) However, this would require at least two definitions: one for the schema, one for the hopefully-compliant parser. Alternately, the schema could potentially be extended to support argparse settings for various items, at the expense of full standards compatibility.

There's already a python jsonschema package.
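
As a rough idea of what the "validate a parser against the schema before runtime" check might look like, here is a standard-library-only sketch that compares a parser's option names to a schema's property names. The schema contents, option names, and the `parser_schema_mismatches` helper are all hypothetical, and the sketch assumes the parser has no required arguments.

```python
import argparse

# Hypothetical schema fragment, written in JSON Schema style.
SCHEMA = {
    "type": "object",
    "properties": {
        "log_level": {"type": "string", "enum": ["debug", "info", "warning"]},
        "workers": {"type": "integer", "minimum": 1},
    },
    "required": ["log_level", "workers"],
    "additionalProperties": False,
}


def build_parser():
    """Build the hopefully-compliant parser for the schema above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", default="info")
    parser.add_argument("--workers", type=int, default=2)
    return parser


def parser_schema_mismatches(parser, schema):
    """Compare parser option names with schema property names.

    Assumes the parser has no required arguments, so parsing an empty
    commandline yields every dest with its default value.
    """
    dests = set(vars(parser.parse_args([])))
    properties = set(schema.get("properties", {}))
    return {
        "not_in_schema": sorted(dests - properties),
        "not_in_parser": sorted(properties - dests),
    }


mismatches = parser_schema_mismatches(build_parser(), SCHEMA)
```

This only checks names, not types or constraints. For the runtime half, the dict from `vars(parser.parse_args())` could be handed to the jsonschema package's `validate(instance, schema)` function.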

  1. Commandline argument support: no.

  2. Default config value support: yes.

  3. Config file support: I don't think directly, but anything that can be converted to a dict can be validated.

  4. Multiple config file type support: no.

  5. Adding the above three solutions together: no.

  6. Config definition and validation: yes.

  7. Adding groups of commandline arguments together: no.

  8. Adding config definitions together: sure, you can add dicts together via update().

  9. The ability to lock/log changes to the config: no.

  10. Python 3 support: yes. Python 2.7 unicode support: I'd guess yes since it has python3 support.

  11. Standardized solution: yes, even cross-language.

  12. All-in-one solution: no, for the above limitations.
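
One caveat on item 8: `dict.update()` is shallow, so if both schemas carry a nested "properties" dict, the second replaces the first rather than merging with it. A minimal sketch of a merge that combines them, with made-up schema contents and a hypothetical `merge_schemas` helper:

```python
import copy

# Hypothetical common and script-specific schema fragments.
COMMON_SCHEMA = {
    "type": "object",
    "properties": {"log_level": {"type": "string"}},
    "required": ["log_level"],
}
SCRIPT_SCHEMA = {
    "type": "object",
    "properties": {"workers": {"type": "integer"}},
    "required": ["workers"],
}


def merge_schemas(base, extra):
    """Merge two object schemas, combining properties and required lists."""
    merged = copy.deepcopy(base)
    merged.setdefault("properties", {}).update(
        copy.deepcopy(extra.get("properties", {}))
    )
    required = list(merged.get("required", []))
    for name in extra.get("required", []):
        if name not in required:
            required.append(name)
    if required:
        merged["required"] = required
    return merged


# A plain update() keeps only the second schema's "properties".
shallow = dict(COMMON_SCHEMA)
shallow.update(SCRIPT_SCHEMA)

# The merge helper keeps both sets of properties and required keys.
merged = merge_schemas(COMMON_SCHEMA, SCRIPT_SCHEMA)
```

JSON Schema's own `allOf` keyword is another way to combine schemas without hand-merging dicts.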

scriptharness 0.2.0 ConfigTemplate + LoggingDict or ReadOnlyDict

Scriptharness currently extends argparse and dict for its config. It checks off the most boxes in the requirements list currently. My biggest worry with the ConfigTemplate is that it isn't fully standardized, so people may be hesitant to port all of their configs to it.

An argparse/json-schema solution with enough glue code in between might be a good solution. I think ConfigTemplate is sufficiently close to that that adding jsonschema support shouldn't be too difficult, so I'm leaning in that direction right now. Configman has some nice behind the scenes and cross-file-type support, but the python3 and __getattr__ issues are currently blockers, and it seems like a lateral move in terms of standards.

An alternate solution may be BYOC. If the scriptharness Script takes a config object that you built from somewhere, and gives you tools that you can choose to use to build that config, that may allow for enough flexibility that people can use their preferred style of configuration in their scripts. The cost of that flexibility is familiarity between scriptharness scripts.

  1. Commandline argument support: yes.

  2. Default config value support: yes, both through argparse parsers and script initial_config.

  3. Config file support: yes. You can define multiple required config files, and multiple optional config files.

  4. Multiple config file type support: no. Mozharness had .py and .json. Scriptharness currently only supports json because I was a bit iffy about execfile-ing python again, and PyYAML doesn't always install cleanly everywhere. It's on the list to add more formats, though. We probably need at least one dynamic type of config file (e.g. python or yaml) or a config-file builder tool.

  5. Adding the above three solutions together: yes.

  6. Config definition and validation: yes.

  7. Adding groups of commandline arguments together: yes.

  8. Adding config definitions together: yes.

  9. The ability to lock/log changes to the config: yes. By default Scripts use LoggingDict that logs runtime changes; StrictScript uses a ReadOnlyDict (same as mozharness) that prevents any changes after locking.

  10. Python 3 and python 2.7 unicode support: yes.

  11. Standardized solution: no. Extended/abstracted argparse + extended python dict.

  12. All-in-one solution: yes.
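
To illustrate the locking and logging ideas from item 9, here is a minimal sketch of two dict subclasses. The class names `LockableDict` and `LoggedDict` and the "config" logger name are made up; this is not scriptharness's LoggingDict or ReadOnlyDict code, and it only guards top-level keys, not nested mutable values.

```python
import logging

log = logging.getLogger("config")


class LockableDict(dict):
    """Dict that refuses top-level changes once lock() has been called."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._locked = False

    def lock(self):
        self._locked = True

    def _check(self):
        if self._locked:
            raise TypeError("config is locked")

    def __setitem__(self, key, value):
        self._check()
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._check()
        super().__delitem__(key)

    # dict's other mutators bypass __setitem__, so guard them explicitly.
    def update(self, *args, **kwargs):
        self._check()
        super().update(*args, **kwargs)

    def pop(self, *args):
        self._check()
        return super().pop(*args)

    def popitem(self):
        self._check()
        return super().popitem()

    def setdefault(self, key, default=None):
        self._check()
        return super().setdefault(key, default)

    def clear(self):
        self._check()
        super().clear()


class LoggedDict(dict):
    """Dict that logs item assignments and deletions instead of blocking them."""

    def __setitem__(self, key, value):
        log.info("config[%r] = %r (was %r)", key, value, self.get(key))
        super().__setitem__(key, value)

    def __delitem__(self, key):
        log.info("del config[%r]", key)
        super().__delitem__(key)
```

A script could build its config normally, call `lock()` once setup is done, and then any later attempt to change a top-level key raises a TypeError rather than silently altering behavior.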

Corrections, additions, feedback?

As far as I can tell there is no perfect solution here. Thoughts?
