Matthew Rosewarne on 9 Jan 2008 12:02:46 -0800



Re: [PLUG] Config Files (Was: Sharing an Internet Connection)


On Wednesday 09 January 2008, Stephen Gran wrote:
> > 1. Configuration files are extremely difficult to edit programmatically. 
> > Many applications get around this by simply writing "DO NOT EDIT!"
> > somewhere in the file or having an extremely limited syntax.  Placing
> > such restrictions on config files negates any benefit from being able to
> > edit them manually.
>
> This is not my experience - are you thinking of auto generated config
> files which have parsers that are fragile?

This is a widespread practice.  You can observe this sort of behaviour in 
Debconf, YAST, /usr/sbin/update-*, and all sorts of other applications.

> > 2. Configuration files are not very granular.  This makes it extremely
> > difficult to propagate changes to parts of a configuration without
> > overwriting the whole thing, which is an enormous obstacle to managing
> > multiple machines or editing a configuration programmatically.
>
> Many parsers now support the equivalent of include.  For those that
> don't, it's still fairly easy to target a change at a block of a file.

This is still often only useful for a user making limited changes to a file 
that is otherwise entirely controlled programmatically.  It is just not 
possible to make this system reliable for anything but the most basic of 
editing, such as what you'd see in Debian's /boot/grub/menu.lst 
or /etc/mailcap.

> > 3. When a user modifies a configuration file, the program is typically
> > not aware of the changes.  While this may be useful in some situations,
> > most of the time this would involve restarting the program and hoping the
> > changes don't cause it to barf.  In a high-uptime environment, this is
> > problematic. It also makes it far more difficult to have hooks run when
> > certain values are changed.
>
> This is the same no matter what your storage medium is for
> configuration.  The running software needs to notice that the
> configuration has changed, reread it, validate it, and start using it.
> It really doesn't matter how you store it on disk.

That's sort of my point: it shouldn't matter to the application how its 
configuration info is stored on disk; it just needs its info.
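To illustrate the hooks mentioned in point 3, here is a minimal Python sketch 
of a store that notifies the application when a value changes.  ConfigStore, 
on_change and the "http/port" key are hypothetical names made up for this 
example, not the API of any existing system, and persistence is deliberately 
left out.

```python
from typing import Callable, Dict, List

# A hook receives (key, old_value, new_value).
Hook = Callable[[str, str, str], None]


class ConfigStore:
    """Hypothetical in-memory store; how values reach the disk is hidden."""

    def __init__(self, initial: Dict[str, str]) -> None:
        self._values = dict(initial)
        self._hooks: Dict[str, List[Hook]] = {}

    def get(self, key: str) -> str:
        return self._values[key]

    def on_change(self, key: str, hook: Hook) -> None:
        # Register a callback that runs whenever `key` gets a new value.
        self._hooks.setdefault(key, []).append(hook)

    def set(self, key: str, value: str) -> None:
        old = self._values.get(key, "")
        if old == value:
            return  # nothing changed, so no hook fires
        self._values[key] = value
        for hook in self._hooks.get(key, []):
            hook(key, old, value)


events = []
store = ConfigStore({"http/port": "80"})
store.on_change("http/port", lambda key, old, new: events.append((key, old, new)))
store.set("http/port", "8080")  # fires the hook once
store.set("http/port", "8080")  # same value again: no hook
```

A real service would rebind its socket in the hook rather than append to a 
list, and would validate the new value before accepting it.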

> > 4. Text files are slow and unwieldy for applications to read or write.  A
> > configuration file must often be parsed in its entirety in order for any
> > of its information to be extracted.  Similarly, they usually must be
> > written out fully for any changes to be saved.  While it might be
> > possible to get around this with regex magic, it still requires parsing
> > the entire file and is likely to break.
>
> Text files are actually very good - you can do linear seeks through the
> whole file, and do validation at the end.  If you are only interested in
> extracting tiny bits out of a monolithic file, some sort of indexed
> storage would be preferable, of course.  That sort of approach is
> usually unnecessary though - I actually want apache to reread its
> entire configuration to make sure nothing else has changed.  The only
> time you need to extract one bit from a configuration store instead of
> the whole thing is when you store all programs' configuration information
> in the same store.  In which case you've got the Windows registry, or
> you're sufficiently silly that you've reinvented it (gconf, I'm looking
> at you).

This is only true if your file is not particularly complicated, as in an INI 
file, without any nested structures.  When you make a pass with sed through a 
more complex config file, you have to hope that you aren't breaking it due to 
side-effects from something you didn't expect.  If you're doing this all by 
hand, it's not a big deal, other than the fact that it takes you a while.  If 
you're trying to make a program do this in an automated way, you have to take 
a more desperate approach, such as declaring certain parts of the config 
files as non-editable.
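A small Python sketch of the side-effect problem, using a made-up two-section 
INI file (SAMPLE and the function names are assumptions for illustration 
only).  The sed-style substitution changes every "port" line, while a 
section-aware edit touches only the one that was meant.

```python
import configparser
import io
import re

# Made-up sample config: two services that both have a "port" key.
SAMPLE = """\
[http]
port = 80

[ssh]
port = 22
"""


def naive_set_port(text, new_port):
    # sed-style edit: rewrites every "port = ..." line, whatever its section.
    return re.sub(r"(?m)^port\s*=.*$", f"port = {new_port}", text)


def structured_set_port(text, section, new_port):
    # Parse the file, change one key in one section, write everything back.
    parser = configparser.ConfigParser()
    parser.read_string(text)
    parser[section]["port"] = str(new_port)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


def ports(text):
    # Helper: map each section name to its port value.
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: parser[name]["port"] for name in parser.sections()}


if __name__ == "__main__":
    print(ports(naive_set_port(SAMPLE, 8080)))
    print(ports(structured_set_port(SAMPLE, "http", 8080)))
```

Note that even the structured version rewrites the whole file, and Python's 
configparser drops any comments when it writes, so a hand-edited file would 
still lose information.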

It's not really relevant whether you're storing a whole system's 
configuration in one file.  It's difficult enough even to add a monitor or 
mouse to /etc/X11/xorg.conf without discarding the whole thing, which many X 
configuration apps do.  It's even harder to make some change that might 
require changing parts of several different configuration files, such as 
changing the port of a service, which would require updating the server 
configuration, firewall, forwarding, etc.

> > 5. Just about every configuration file has a different syntax.  Not only
> > does this mean that users must learn all sorts of different syntaxes to
> > manage their systems, but that a different parser program must be written
> > for every individual file.  Often this results in a scenario like that of
> > TeX, where there is only one parser implementation capable of fully
> > understanding the contents of a file.
>
> Is this actually important in practice?  Most programs are actually
> converging on several of the big configuration styles (apache, ini,
> etc).  The parser for the config file is presumably already written by
> the time you start running the program, so this isn't actually a
> downside.  The only hard part is that users might need to get used to
> another format.  File a bug that they should use one of the existing
> standard formats, if that bothers you.

For most configuration files, the only parser available is the one built into 
the application itself, which is not easily used by an external program.  And 
while the syntax of some configuration files may look somewhat similar to a 
user, the details of actually understanding and editing them make it largely 
impossible to write one parser for multiple different files.

> > There are other issues, but those are the most important ones I can think
> > of off the top of my head.  I'm certainly not going to claim that an
> > approach like the accursed Windows Registry is superior; it's terrible. 
> > The best option I can think of would be a flexible configuration
> > _interface_ (by "interface" think "API" rather than "GUI") that could be
> > viewed/edited as a text file, or maybe an XML file, or even an LDAP tree,
> > with the actual storage being handled out of sight.  I would hope that
> > something such as the Elektra Initiative would come into fruition and
> > gain more widespread use, giving us Free software users the best of all
> > possible options.
>
> All of the options you're discussing mean that a running system's
> configuration depends on bigger and bigger layers of abstraction and
> additional support, making any given service more fragile.  It also means
> more overhead for running a given service, making things like openmoko
> quite a bit harder.

Actually, it means sharing functionality instead of reimplementing it over and 
over again for every individual application or config file.  If anything, 
that makes a system far *more* robust and massively reduces overhead.  Think 
of the difference between using shared libraries vs. statically linking every 
single program on your machine, or perhaps even having every app write its 
own implementation of open() instead of using libc.

> Sorry to argue quite so strenuously, but I feel you really haven't made
> a case for any alternative, and I've now seen this same attitude crop up
> in a few places in the last few days.  I am curious what (aside from "I
> hate having to remember the syntax for $program") is motivating this.

I've come across many (well, not *too* many) Windows zealots who don't realise 
the strengths of config files and attempt to justify the horrors of the 
Windows Registry.  Clearly, I'm not in that camp, but I do recognise that 
editing configuration files is not the be-all and end-all of configuration 
interfaces.

As for how we got here, I'd say it arose from a series of tangents, 
though the original topic had something to do with routers (not really a 
topic I know much about).

> So, if you want to convince me that text files should be replaced with
> something else, go ahead.  This is what I think a good configuration
> system should be:
>
> does not rely on any network services (SQL, LDAP, etc)
> Human editable in case of disaster
> Support scripted roll out (no GUI only)

Well, it wouldn't make sense to make such a system that required SQL, LDAP, or 
some other heavy infrastructure.  We have enough experience nowadays with 
designing a core API that can allow pluggable implementations, for example 
the Linux VFS layer.  The idea would be to allow one to access the 
configuration data by whatever means they choose via pluggable frontends, be 
it an LDAP tree, XML file, or directory tree of text files with some sort of 
standard syntax, just like an application can open() a file regardless of 
whether the underlying filesystem is ext3, XFS, reiser, or whatever else.
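As a rough Python sketch of what I mean by pluggable backends: the class names 
(ConfigBackend, IniBackend, DirTreeBackend) and the move_service_port() 
function are hypothetical, invented for this example, and do not correspond 
to Elektra, GConf, KConfig, or any other existing API.  The application code 
only sees get() and set(); which storage sits underneath is decided 
elsewhere.

```python
import configparser
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ConfigBackend(ABC):
    """Hypothetical storage interface; applications only see get/set."""

    @abstractmethod
    def get(self, section: str, key: str) -> str: ...

    @abstractmethod
    def set(self, section: str, key: str, value: str) -> None: ...


class IniBackend(ConfigBackend):
    """Stores everything in a single INI file."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._parser = configparser.ConfigParser()
        self._parser.read(path)  # a missing file is treated as empty

    def get(self, section, key):
        return self._parser[section][key]

    def set(self, section, key, value):
        if not self._parser.has_section(section):
            self._parser.add_section(section)
        self._parser[section][key] = value
        with open(self._path, "w") as handle:
            self._parser.write(handle)


class DirTreeBackend(ConfigBackend):
    """Stores one small text file per key: <root>/<section>/<key>."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def get(self, section, key):
        return (self._root / section / key).read_text().strip()

    def set(self, section, key, value):
        target = self._root / section / key
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(value + "\n")


def move_service_port(config: ConfigBackend, port: int) -> str:
    # Application code: identical no matter which backend is plugged in.
    config.set("http", "port", str(port))
    return config.get("http", "port")


def demo() -> list:
    # Run the same application code against both backends in a temp dir.
    with tempfile.TemporaryDirectory() as tmp:
        backends = [
            IniBackend(Path(tmp) / "app.ini"),
            DirTreeBackend(Path(tmp) / "tree"),
        ]
        return [move_service_port(backend, 8080) for backend in backends]
```

An LDAP or XML backend would be one more subclass, and both of the backends 
above still leave plain text files on disk that can be edited by hand in a 
disaster.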

> Does not require more library dependencies (embedded devices)

Having one implementation of reading/parsing/editing configuration data in one 
shared library would save quite a lot of space compared to each app having 
its own implementation.

> Must reliably keep state across restarts (cannot keep state in RAM, eg)

Not really much good if you can't reboot the thing, is it?

> And maybe I'm missing some others, but that's probably a good start.
> If you can suggest something better than text that meets those, let's
> talk about it.

Unfortunately, I'm not aware of any current system that is quite like what I 
describe, though there are some that are going in the right direction:

* GConf is implemented in a fairly abominable way, with its own daemon and 
special editing tools, though it does have pluggable backends such as XML and 
SQL.

* KConfig is quite a good system, storing its information in an INI-like 
format, but using something called the "system configuration cache" (sycoca) 
to read the information into a temporary binary cache for a massive 
performance boost.  The newly-added backend for storing its info in an LDAP 
format (be it an LDAP server or Samba's LDB) could become seriously useful 
for centrally-managing configuration for a whole network of machines, though 
clearly it won't be used much outside of KDE.

* I haven't looked too deeply into Elektra yet, though I certainly appreciate 
their idea of making a consistent configuration interface spec and 
lightweight API.


___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug